Generative Adversarial Networks (GANs) Implementation for EMNIST Letter Generation¶

Part A: Data Analysis and Vanilla GAN Development¶

GAN CA2 PART A¶

AUTHORED BY SHEN LEI AND XAVIER LEE¶

DAAA/FT/2B/22¶

7/14/2025¶

Project Overview¶

This notebook presents a comprehensive implementation and analysis of Generative Adversarial Networks (GANs) applied to the Extended Modified National Institute of Standards and Technology (EMNIST) Letters dataset. The project demonstrates the application of deep learning techniques for generating synthetic handwritten letter images through adversarial training.

Project Objectives¶

  1. Data Exploration and Analysis: Comprehensive exploratory data analysis of the EMNIST Letters dataset
  2. Data Preprocessing: Implementation of advanced preprocessing techniques for optimal GAN training
  3. GAN Architecture Design: Development of generator and discriminator networks optimized for letter generation
  4. Model Training and Evaluation: Training GANs with proper monitoring and evaluation metrics
  5. Performance Assessment: Quantitative and qualitative evaluation of generated letter quality

Authors and Academic Context¶

  • Authors: Shen Lei & Xavier Lee
  • Course: Data Analytics and Algorithms (DAAA)
  • Program: Full-Time Diploma, Class 2B/22
  • Date: July 14, 2025

Dataset Information¶

The EMNIST Letters dataset is an extension of the famous MNIST database, containing handwritten alphabetic characters. It provides a more challenging and diverse set of character recognition tasks compared to standard digit datasets, making it ideal for evaluating generative model capabilities.

Technical Framework¶

This implementation leverages TensorFlow 2.x and Keras for deep learning model development, with comprehensive data analysis using NumPy, Pandas, and scientific visualization libraries.

1. Environment Setup and Dependencies¶

Library Import Strategy¶

This section imports all necessary libraries organized by functionality to ensure reproducible results and optimal performance. The imports are structured hierarchically to facilitate maintenance and debugging.

Key Library Categories:¶

  • Core Scientific Computing: NumPy, Pandas, SciPy for mathematical operations and data manipulation
  • Deep Learning Framework: TensorFlow 2.x ecosystem including Keras for neural network implementation
  • Visualization: Matplotlib, Seaborn for comprehensive data visualization and result presentation
  • Data Analysis: Scikit-learn for preprocessing, dimensionality reduction, and evaluation metrics
  • Utility Libraries: Progress tracking, I/O operations, and image processing capabilities

Performance Considerations:¶

  • TensorFlow backend configuration for optimal GPU utilization
  • Random seed initialization for reproducible experiments
  • Memory-efficient data loading and processing strategies
In [ ]:
# ==============================
# ENVIRONMENT SETUP
# ==============================
import os
os.environ["KERAS_BACKEND"] = "tensorflow"  # Keep if swapping to Keras 3 later

# ==============================
# STANDARD LIBRARIES
# ==============================
import time
import random
import string
import datetime
from collections import defaultdict

# ==============================
# CORE SCIENTIFIC STACK
# ==============================
import numpy as np
import pandas as pd
from scipy.linalg import sqrtm

# ==============================
# VISUALIZATION
# ==============================
import matplotlib.pyplot as plt
from matplotlib.gridspec import GridSpec
import seaborn as sns

# Optional: model diagrams
try:
    import visualkeras
except ImportError:
    visualkeras = None

# ==============================
# PROGRESS / DISPLAY / I/O
# ==============================
from tqdm import tqdm
from IPython import display
import imageio.v2 as imageio  # Avoids deprecation warnings

# ==============================
# TENSORFLOW ECOSYSTEM
# ==============================
import tensorflow as tf
# Optional: Addons
try:
    import tensorflow_addons as tfa
except ImportError:
    tfa = None

# ---- tf.keras: layers / models / utils
from tensorflow.keras.layers import (
    Dense, Reshape, Flatten, Dropout, BatchNormalization, LeakyReLU,
    Conv2DTranspose, Conv2D, Embedding, Concatenate, Input,
    ReLU, Activation, ZeroPadding2D, Add, GlobalAveragePooling2D,
    MaxPooling2D, UpSampling2D, Lambda
)
from tensorflow.keras.models import Model, Sequential, load_model
from tensorflow.keras.applications.inception_v3 import InceptionV3, preprocess_input
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.callbacks import EarlyStopping, Callback
from tensorflow.keras.losses import (
    SparseCategoricalCrossentropy, CategoricalCrossentropy, BinaryCrossentropy
)
from tensorflow.keras.utils import to_categorical, Progbar
from tensorflow.keras.optimizers import Adam

# ==============================
# SKLEARN UTILITIES
# ==============================
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder, StandardScaler

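The performance considerations listed earlier mention GPU utilization; a minimal, optional sketch (not part of the original notebook) that asks TensorFlow to allocate GPU memory on demand rather than reserving the whole device up front:

```python
import tensorflow as tf

# Enable memory growth for each visible GPU so TensorFlow allocates
# memory on demand instead of grabbing the full device at startup.
# This is a no-op on CPU-only machines and must run before any op
# touches the GPU.
for gpu in tf.config.list_physical_devices("GPU"):
    tf.config.experimental.set_memory_growth(gpu, True)

print(f"{len(tf.config.list_physical_devices('GPU'))} GPU(s) visible")
```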
2. Dataset Import and Initial Analysis¶

EMNIST Letters Dataset Overview¶

The Extended Modified National Institute of Standards and Technology (EMNIST) Letters dataset represents a significant advancement over the traditional MNIST digit dataset, providing a more complex and realistic challenge for machine learning models.

Dataset Characteristics:¶

  • Image Dimensions: 28×28 pixels (grayscale)
  • Character Classes: 26 alphabetic letters (A-Z)
  • Data Format: CSV format with pixel intensities and class labels
  • Scale: Large-scale dataset with substantial samples per class
  • Complexity: Natural handwriting variations, multiple writing styles

Initial Data Loading Process:¶

  1. Memory Management: Efficient loading strategies for large datasets
  2. Data Integrity: Verification of data completeness and format consistency
  3. Class Distribution: Analysis of label balance across alphabet classes
  4. Basic Statistics: Fundamental statistical properties of the dataset
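On the memory-management point above: pixel values fit in an unsigned byte, so passing `dtype` to `pandas.read_csv` cuts memory roughly 8x versus the default `int64`. A sketch on a tiny synthetic CSV (not the real file); note the label column needs a signed dtype because this dataset contains negative labels:

```python
import io
import numpy as np
import pandas as pd

# Miniature stand-in for emnist-letters-train.csv: label, then 784 pixels.
csv_text = "\n".join(",".join(["-2"] + ["0"] * 784) for _ in range(5))

# Default load: every column inferred as int64 (8 bytes per value).
df_default = pd.read_csv(io.StringIO(csv_text), header=None)

# Typed load: uint8 pixels (1 byte each), int16 for the signed label column.
dtypes = {i: np.uint8 for i in range(1, 785)}
dtypes[0] = np.int16
df_typed = pd.read_csv(io.StringIO(csv_text), header=None, dtype=dtypes)

print(df_default.memory_usage(deep=True).sum())  # bytes, int64 everywhere
print(df_typed.memory_usage(deep=True).sum())    # roughly 8x smaller
```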

Expected Challenges:¶

  • Class Imbalance: Potential uneven distribution across letter classes
  • Visual Similarity: Confusion between similar letters (e.g., 'O' vs 'Q', 'I' vs 'L')
  • Orientation Issues: Known data formatting challenges from original NIST database
  • Noise and Artifacts: Handling imperfect handwriting samples

In [2]:
# Set random seeds for reproducibility
random_state = 42
np.random.seed(random_state)
tf.random.set_seed(random_state)
In [3]:
# Load EMNIST Letters Dataset
emnist_data = pd.read_csv('emnist-letters-train.csv', header=None)
print(f"Dataset loaded successfully!")
print(f"Dataset shape: {emnist_data.shape}")
print(f"Memory usage: {emnist_data.memory_usage(deep=True).sum() / 1e6:.1f} MB")


# Display basic information about the dataset
print("\n Dataset Overview:")
print("="*50)
print(f"Total samples: {len(emnist_data):,}")
print(f"Features per image: {emnist_data.shape[1] - 1}")
print(f"Image dimensions: 28x28 pixels")
print(f"Data type: Grayscale handwritten letters")

# Show first few rows
print("\nFirst 5 rows of the dataset:")
emnist_data.head()
Dataset loaded successfully!
Dataset shape: (64829, 785)
Memory usage: 407.1 MB

 Dataset Overview:
==================================================
Total samples: 64,829
Features per image: 784
Image dimensions: 28x28 pixels
Data type: Grayscale handwritten letters

First 5 rows of the dataset:
Out[3]:
0 1 2 3 4 5 6 7 8 9 ... 775 776 777 778 779 780 781 782 783 784
0 24 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
1 -2 142 142 142 142 142 142 142 142 142 ... 142 142 142 142 142 142 142 142 142 142
2 15 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
3 14 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
4 -2 120 120 120 120 120 120 120 120 120 ... 120 120 120 120 120 120 120 120 120 120

5 rows × 785 columns

In [4]:
# Extract features and labels from the dataset
print("Preprocessing dataset...")

# Separate features (pixel values) and labels (class information)
X_raw = emnist_data.iloc[:, 1:].values  # All columns except first (pixel values)
y_raw = emnist_data.iloc[:, 0].values   # First column (class labels)

# Reshape images to 28x28 format and correct orientation
X_images = X_raw.reshape(-1, 28, 28)
X_images = X_images.transpose(0, 2, 1)  # orientation correction, explained in Section 2.2

print(f"Data preprocessing completed!")
print(f"Image array shape: {X_images.shape}")
print(f"Labels array shape: {y_raw.shape}")

# Analyze class distribution
unique_classes, class_counts = np.unique(y_raw, return_counts=True)
print(f"\nClass Distribution Analysis:")
print(f"• Unique classes found: {len(unique_classes)}")
print(f"• Class range: {unique_classes.min()} to {unique_classes.max()}")
print(f"• Total samples: {len(y_raw):,}")

# Create a comprehensive class distribution visualization
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 6))

# Horizontal bar chart for class distribution
class_df = pd.DataFrame({'Class': unique_classes, 'Count': class_counts})
bars = ax1.barh(class_df['Class'], class_df['Count'], color=plt.cm.viridis(np.linspace(0, 1, len(unique_classes))))

# Add count annotations on bars
for i, (class_num, count) in enumerate(zip(unique_classes, class_counts)):
    ax1.text(count + 50, class_num, f'{count:,}', va='center', fontweight='bold')

ax1.set_xlabel('Number of Samples', fontweight='bold')
ax1.set_ylabel('Class Label', fontweight='bold')
ax1.set_title('Distribution of Classes in EMNIST Letters Dataset', fontweight='bold', fontsize=14)
ax1.grid(axis='x', alpha=0.3)

# Class distribution statistics
ax2.text(0.1, 0.9, f"Dataset Statistics:", fontsize=16, fontweight='bold', transform=ax2.transAxes)
ax2.text(0.1, 0.8, f"• Total Classes: {len(unique_classes)}", fontsize=12, transform=ax2.transAxes)
ax2.text(0.1, 0.7, f"• Samples per class (avg): {class_counts.mean():.0f}", fontsize=12, transform=ax2.transAxes)
ax2.text(0.1, 0.6, f"• Samples per class (std): {class_counts.std():.0f}", fontsize=12, transform=ax2.transAxes)
ax2.text(0.1, 0.5, f"• Most frequent class: {unique_classes[np.argmax(class_counts)]} ({class_counts.max():,} samples)", fontsize=12, transform=ax2.transAxes)
ax2.text(0.1, 0.4, f"• Least frequent class: {unique_classes[np.argmin(class_counts)]} ({class_counts.min():,} samples)", fontsize=12, transform=ax2.transAxes)
ax2.text(0.1, 0.3, f"• Balance ratio: {class_counts.max()/class_counts.min():.2f}:1", fontsize=12, transform=ax2.transAxes)

invalid_label_mask = np.isin(unique_classes, [-2, -1])
if invalid_label_mask.any():
    invalid_total = class_counts[invalid_label_mask].sum()
    ax2.text(0.1, 0.2, f"Invalid classes (-1 and -2): {invalid_total:,} samples", fontsize=12, color='red', transform=ax2.transAxes)

ax2.set_xlim(0, 1)
ax2.set_ylim(0, 1)
ax2.axis('off')

plt.tight_layout()
plt.show()
Preprocessing dataset...
Data preprocessing completed!
Image array shape: (64829, 28, 28)
Labels array shape: (64829,)

Class Distribution Analysis:
• Unique classes found: 18
• Class range: -2 to 26
• Total samples: 64,829
[Figure: horizontal bar chart of class counts alongside the dataset statistics panel]

2.1 Initial Dataset Observations and Data Quality Assessment¶

  • Several letters appear to be missing from the dataset, while the remaining samples were each mapped to a numeric class.
  • Letters such as 'C', 'H', and 'K' do not appear in this dataset, which is likely explained by the class subset shown in the assignment brief.
  • We already know what the positive labels correspond to, but what do the negative labels represent?

Key Findings from Initial Analysis:¶

Class Distribution Anomalies¶

  • Missing Letter Classes: Several alphabetic characters appear to be absent from the dataset
  • Negative Label Investigation: The presence of classes labeled as -1 and -2 requires investigation
  • Class Mapping Verification: Need to establish proper correspondence between numeric labels and alphabetic characters
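The class-mapping point above can be verified with a one-liner: EMNIST Letters encodes labels 1 through 26 as 'A' through 'Z' (a sketch of the mapping used later in the notebook):

```python
# EMNIST Letters encodes 1→'A' ... 26→'Z'; chr(64 + i) performs the mapping
# because ord('A') == 65.
letter_mapping = {i: chr(64 + i) for i in range(1, 27)}

print(letter_mapping[1], letter_mapping[26])  # A Z
```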

Data Quality Indicators¶

The dataset exhibits characteristics typical of real-world handwriting data:

  • Class Imbalance: Uneven distribution of samples across letter classes
  • Sample Variability: Wide range of handwriting styles within each class
  • Edge Cases: Presence of potentially corrupted or mislabeled samples

Technical Considerations¶

Based on the assignment documentation and our initial findings:

  • Label Encoding: Numeric class labels require mapping to corresponding alphabetic characters
  • Data Preprocessing: Additional cleaning steps needed to handle negative labels and missing classes
  • Quality Control: Implementation of validation procedures to ensure data integrity

Negative Label Investigation¶

The presence of negative class labels (-1, -2) in the dataset suggests:

  • Rejected Samples: These may represent samples that failed quality control during original dataset creation
  • Unlabeled Data: Potentially represents characters that couldn't be confidently classified
  • Data Artifacts: Could be remnants from the original NIST database preprocessing pipeline

Next Steps: We will examine these negative label samples visually to determine their nature and decide on appropriate handling strategies.


Exploratory Data Analysis & Visualization¶

Now let's dive deeper into understanding our dataset through comprehensive exploratory data analysis. This will help us identify patterns, potential challenges, and characteristics that will inform our GAN architecture decisions.


In [5]:
# Define invalid class labels
invalid_classes = [-2, -1]
invalid_mask = np.isin(y_raw, invalid_classes)
invalid_indices = np.where(invalid_mask)[0]

# Show how many are found
print(f"Found {len(invalid_indices):,} samples with invalid labels: {invalid_classes}")

# Display up to 10 invalid samples
n_display = min(10, len(invalid_indices))
if n_display > 0:
    fig, axes = plt.subplots(1, n_display, figsize=(n_display * 2, 2))
    fig.suptitle("Examples of Invalid Class Samples (-1, -2)", fontsize=14, fontweight='bold')
    
    for i in range(n_display):
        idx = invalid_indices[i]
        axes[i].imshow(X_images[idx], cmap='gray')
        axes[i].set_title(f"Label: {y_raw[idx]}", fontsize=9)
        axes[i].axis('off')
    
    plt.tight_layout()
    plt.show()
else:
    print("No samples with invalid class labels found.")
Found 10,240 samples with invalid labels: [-2, -1]
[Figure: ten sample images from the invalid classes (-1, -2); all appear blank]
  • It appears that the negatively labelled samples are blank images that would only harm training.
  • They will therefore be dropped from the dataset.
In [ ]:
# Load EMNIST Letters Dataset
emnist_data = pd.read_csv('emnist-letters-train.csv', header=None)
print(f"Dataset loaded successfully!")

# Separate features (pixel values) and labels (class information)
X_raw = emnist_data.iloc[:, 1:].values  # All columns except first (pixel values)
y_raw = emnist_data.iloc[:, 0].values   # First column (class labels)

# Reshape images to 28x28 format and correct orientation
X_images = X_raw.reshape(-1, 28, 28)

# Create comprehensive letter mapping and sample visualization
print("Analyzing Letter Classes and Creating Mappings...")

# Remove invalid classes (-2, -1) and create proper mappings
valid_classes = [cls for cls in unique_classes if cls not in (-2, -1)]
valid_classes.sort()

# Create letter mapping (EMNIST letters: 1=A, 2=B, ..., 26=Z)
letter_mapping = {i: chr(64 + i) for i in range(1, 27)}  # 1→A, 2→B, etc.
available_letters = [letter_mapping[cls] for cls in valid_classes if cls in letter_mapping]

# Filter data to remove invalid classes (-2, -1)
invalid_classes = [-2, -1]
mask = ~np.isin(y_raw, invalid_classes)
X_filtered = X_images[mask]
y_filtered = y_raw[mask]
removed_count = np.sum(~mask)

# Display sample images for each letter class
print(f"\nSample Images from Each Letter Class:")
available_classes = sorted(np.unique(y_filtered))
samples_per_class = 6

# Calculate grid dimensions
n_classes = len(available_classes)
n_cols = min(13, n_classes)  # Max 13 columns for better display
n_rows_classes = (n_classes + n_cols - 1) // n_cols
total_rows = n_rows_classes * samples_per_class

fig, axes = plt.subplots(total_rows, n_cols, figsize=(n_cols * 1.5, total_rows * 1.5))
fig.suptitle('Sample Images from EMNIST Letters Dataset\n(6 samples per class)', fontsize=16, fontweight='bold')

# Flatten axes for easier indexing
if total_rows == 1:
    axes = axes.reshape(1, -1)
axes_flat = axes.flatten()

# Hide all axes initially
for ax in axes_flat:
    ax.axis('off')

# Plot samples for each class
for class_idx, class_label in enumerate(available_classes):
    # Get samples for this class
    class_samples = X_filtered[y_filtered == class_label]
    
    # Select random samples
    n_samples = min(samples_per_class, len(class_samples))
    sample_indices = np.random.choice(len(class_samples), n_samples, replace=False)
    
    for sample_idx in range(n_samples):
        # Calculate position in grid
        row = (class_idx // n_cols) * samples_per_class + sample_idx
        col = class_idx % n_cols
        
        if row < total_rows and col < n_cols:
            ax_idx = row * n_cols + col
            if ax_idx < len(axes_flat):
                ax = axes_flat[ax_idx]
                
                # Display image
                img = class_samples[sample_indices[sample_idx]]
                ax.imshow(img, cmap='gray', aspect='equal')
                ax.axis('off')
                
                # Add class label on first sample
                if sample_idx == 0:
                    letter = letter_mapping.get(class_label, f'Class {class_label}')
                    ax.set_title(f'{letter} ({class_label})', fontsize=10, fontweight='bold')

plt.tight_layout()
plt.show()
Dataset loaded successfully!
Analyzing Letter Classes and Creating Mappings...

Sample Images from Each Letter Class:
[Figure: sample grid of letter images, 6 per class, before orientation correction; letters appear rotated and mirrored]

2.2 Image Orientation Analysis and Correction Strategy¶

Technical Root Cause Analysis¶

After thorough investigation of the dataset samples, we discovered systematic orientation issues affecting all images in the EMNIST Letters dataset. This phenomenon stems from the historical data collection and storage methodology used in the original NIST Special Database 19.

Historical Context and Technical Background¶

The orientation issues can be traced to the following technical factors:

  1. Fortran-based Storage Format:

    • The original NIST database stored images using Fortran's column-major memory layout
    • This resulted in images being stored column-by-column rather than row-by-row
    • When read using modern row-major languages (Python, C++), images appear rotated
  2. Scanning and Digitization Process:

    • Original documents were scanned using specialized equipment with different coordinate systems
    • The scanning process introduced a 90-degree counterclockwise rotation
    • Additional horizontal mirroring occurred during the digitization pipeline
  3. Data Format Conversion:

    • Conversion from binary format to CSV introduced additional transformation artifacts
    • Transpose operations during format conversion compounded the orientation issues

Observed Transformations¶

Our analysis reveals that images in the dataset have undergone:

  • 90-degree counterclockwise rotation: Letters appear sideways
  • Horizontal mirroring: Letters are flipped left-to-right
  • Coordinate system conversion: Original (x,y) coordinates are effectively transformed to (y,-x)

Correction Strategy Implementation¶

To restore proper letter orientation, a single transpose suffices:

  1. Transpose Operation: X_images.transpose(0, 2, 1) swaps each image's row and column axes
  2. Combined Effect: because the stored images are the transpose of the intended ones, this one operation simultaneously undoes the apparent rotation and mirroring described above

This systematic approach ensures that our GAN models will train on properly oriented letter images, leading to more realistic and recognizable generated characters.
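The transpose correction described above can be seen on a toy array (a sketch with a made-up 3x3 "image", not dataset code):

```python
import numpy as np

# A tiny asymmetric "image" so the orientation change is visible: an L-shape.
img = np.array([
    [1, 0, 0],
    [1, 0, 0],
    [1, 1, 1],
])

batch = img[np.newaxis, ...]          # shape (1, 3, 3), analogous to (N, 28, 28)
corrected = batch.transpose(0, 2, 1)  # swap the two spatial axes, keep batch axis

print(corrected[0])  # each row of the original becomes a column
```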

Impact on GAN Training¶

Proper image orientation is crucial for GAN performance because:

  • Feature Learning: Correctly oriented features enable better pattern recognition
  • Spatial Relationships: Proper spatial structure improves generator architecture effectiveness
  • Human Evaluation: Generated images will be interpretable and match expected letter appearances
In [7]:
# Load EMNIST Letters Dataset
emnist_data = pd.read_csv('emnist-letters-train.csv', header=None)
print(f"Dataset loaded successfully!")

# Separate features (pixel values) and labels (class information)
X_raw = emnist_data.iloc[:, 1:].values  # All columns except first (pixel values)
y_raw = emnist_data.iloc[:, 0].values   # First column (class labels)

# Reshape images to 28x28 format and correct orientation
X_images = X_raw.reshape(-1, 28, 28)
X_images = X_images.transpose(0, 2, 1)  # orientation correction, explained in Section 2.2

# Create comprehensive letter mapping and sample visualization
print("Analyzing Letter Classes and Creating Mappings...")

# Remove invalid classes (-2, -1) and create proper mappings
valid_classes = [cls for cls in unique_classes if cls not in (-2, -1)]
valid_classes.sort()

# Create letter mapping (EMNIST letters: 1=A, 2=B, ..., 26=Z)
letter_mapping = {i: chr(64 + i) for i in range(1, 27)}  # 1→A, 2→B, etc.
available_letters = [letter_mapping[cls] for cls in valid_classes if cls in letter_mapping]

print(f"Letter Mapping Summary:")
print(f"• Valid classes: {valid_classes}")
print(f"• Available letters: {available_letters}")
print(f"• Number of letter classes: {len(valid_classes)}")

# Filter data to remove invalid classes (-2, -1)
invalid_classes = [-2, -1]
mask = ~np.isin(y_raw, invalid_classes)
X_filtered = X_images[mask]
y_filtered = y_raw[mask]
removed_count = np.sum(~mask)

# Display sample images for each letter class
print(f"\nSample Images from Each Letter Class:")
available_classes = sorted(np.unique(y_filtered))
samples_per_class = 6

# Calculate grid dimensions
n_classes = len(available_classes)
n_cols = min(13, n_classes)  # Max 13 columns for better display
n_rows_classes = (n_classes + n_cols - 1) // n_cols
total_rows = n_rows_classes * samples_per_class

fig, axes = plt.subplots(total_rows, n_cols, figsize=(n_cols * 1.5, total_rows * 1.5))
fig.suptitle('Sample Images from EMNIST Letters Dataset\n(6 samples per class)', fontsize=16, fontweight='bold')

# Flatten axes for easier indexing
if total_rows == 1:
    axes = axes.reshape(1, -1)
axes_flat = axes.flatten()

# Hide all axes initially
for ax in axes_flat:
    ax.axis('off')

# Plot samples for each class
for class_idx, class_label in enumerate(available_classes):
    # Get samples for this class
    class_samples = X_filtered[y_filtered == class_label]
    
    # Select random samples
    n_samples = min(samples_per_class, len(class_samples))
    sample_indices = np.random.choice(len(class_samples), n_samples, replace=False)
    
    for sample_idx in range(n_samples):
        # Calculate position in grid
        row = (class_idx // n_cols) * samples_per_class + sample_idx
        col = class_idx % n_cols
        
        if row < total_rows and col < n_cols:
            ax_idx = row * n_cols + col
            if ax_idx < len(axes_flat):
                ax = axes_flat[ax_idx]
                
                # Display image
                img = class_samples[sample_indices[sample_idx]]
                ax.imshow(img, cmap='gray', aspect='equal')
                ax.axis('off')
                
                # Add class label on first sample
                if sample_idx == 0:
                    letter = letter_mapping.get(class_label, f'Class {class_label}')
                    ax.set_title(f'{letter} ({class_label})', fontsize=10, fontweight='bold')

plt.tight_layout()
plt.show()
Dataset loaded successfully!
Analyzing Letter Classes and Creating Mappings...
Letter Mapping Summary:
• Valid classes: [1, 2, 4, 5, 6, 7, 9, 10, 12, 14, 15, 16, 17, 20, 24, 26]
• Available letters: ['A', 'B', 'D', 'E', 'F', 'G', 'I', 'J', 'L', 'N', 'O', 'P', 'Q', 'T', 'X', 'Z']
• Number of letter classes: 16

Sample Images from Each Letter Class:
[Figure: sample grid of correctly oriented letter images, 6 samples per class]

3. Comprehensive Data Analysis and Feature Engineering¶

3.1 Statistical Analysis and Data Characteristics¶

Purpose and Methodology¶

This section conducts an in-depth statistical analysis of the EMNIST Letters dataset to understand the underlying data distribution, identify potential challenges, and inform architectural decisions for our GAN implementation. Our analysis employs multiple complementary techniques to provide a holistic view of the dataset characteristics.

Analysis Framework¶

Our comprehensive analysis encompasses several key dimensions:

3.1.1 Pixel-Level Statistical Analysis¶

  • Intensity Distribution: Understanding the distribution of pixel values across the entire dataset
  • Class-Specific Patterns: Identifying unique statistical signatures for different letter classes
  • Variance Analysis: Measuring pixel-wise variance to identify the most informative features
  • Sparsity Assessment: Quantifying the proportion of background vs. foreground pixels
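The sparsity assessment above reduces to a single NumPy expression; a sketch on synthetic data (the 8x8 bright patch is made up):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic batch: 10 "images" of 28x28, all background except an 8x8 patch
# of nonzero foreground values (1..255, so the patch is never zero).
batch = np.zeros((10, 28, 28))
batch[:, 10:18, 10:18] = rng.integers(1, 256, size=(10, 8, 8))

# Sparsity = fraction of zero-valued (background) pixels.
sparsity = (batch == 0).mean()
print(f"{sparsity:.2%}")  # 720/784 background pixels → 91.84%
```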

3.1.2 Morphological and Structural Analysis¶

  • Edge Density: Using Sobel edge detection to measure structural complexity
  • Shape Characteristics: Analyzing geometric properties of letter strokes
  • Spatial Distribution: Understanding how letter features are distributed across the 28×28 grid

3.1.3 Dimensionality Reduction and Visualization¶

  • Principal Component Analysis (PCA): Linear dimensionality reduction for variance preservation
  • t-SNE Analysis: Non-linear embedding for cluster visualization and class separation assessment
  • Feature Importance: Identifying the most discriminative pixel positions
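The PCA step in 3.1.3 can be sketched on random stand-in data (the shapes mirror 784-pixel flattened images; the data itself is synthetic):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(42)
X_flat = rng.random((100, 784))  # 100 synthetic "flattened 28x28 images"

# Project to 2 components for visualization; on real EMNIST data these
# would be scatter-plotted and colored by letter class.
pca = PCA(n_components=2, random_state=42)
X_2d = pca.fit_transform(X_flat)

print(X_2d.shape)                     # (100, 2)
print(pca.explained_variance_ratio_)  # per-component variance fraction
```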

Strategic Implications for GAN Design¶

The insights from this analysis will directly inform our GAN architecture decisions:

  • Generator Complexity: Understanding data complexity helps determine required model capacity
  • Training Strategies: Statistical properties guide learning rate and optimization choices
  • Evaluation Metrics: Class characteristics inform appropriate quality assessment methods
  • Data Augmentation: Identifying sparse or challenging classes for targeted augmentation

Objectives and what is being achieved¶

  • Flattening of images:

    • We start off by flattening each image from 28×28 to a 784-dimensional vector to enable statistical analysis.
    • This is needed to calculate metrics like the mean, standard deviation, and histograms across all pixels.
  • Distribution visualizations:

    • Histograms highlight variations in pixel intensity distribution between different letters.
  • Mean and variance by class:

    • Mean pixel intensity (indicating how “bright” the average image is)
    • Variance (indicating how consistent the pixel values are across samples)
    • These measures are useful for understanding which characters are visually denser or more diverse.
  • Sparsity Analysis:

    • Calculates the percentage of zero-valued pixels (background) for each class.
    • Helps to identify which letters are composed of fewer foreground pixels, which matters for preprocessing or augmenting sparse classes.
  • Image Complexity via Edge Detection:

    • Applies the Sobel filter to estimate edge density per class (a proxy for structural complexity).
    • Indicates how intricate the shapes of letters are, which is useful for understanding learning difficulty for models.
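The edge-density proxy described above, sketched on a synthetic 28x28 bar (same thresholding idea as the analysis cell below, but the image is made up):

```python
import numpy as np
from scipy import ndimage

# Synthetic "letter": a bright vertical bar on a black background.
img = np.zeros((28, 28))
img[4:24, 12:16] = 255.0

# The Sobel derivative responds strongly at the bar's left and right edges.
edges = ndimage.sobel(img)

# Edge-density proxy: count pixels whose response exceeds the mean response.
edge_density = np.sum(edges > edges.mean())
print(edge_density)
```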

Why We Do This¶

  • To guide data preprocessing decisions: e.g., normalization, binarization, augmentation strategies.
  • To understand class-level imbalances: e.g., some classes may be more sparse, noisy, or complex than others.
  • To assess dataset quality and diversity, which influences the generalization ability of models.
  • To diagnose challenges that may occur during training, such as overfitting on less complex characters or poor performance on high-sparsity classes.
In [8]:
# Flatten images for statistical analysis
X_flat = X_filtered.reshape(X_filtered.shape[0], -1)
print(f"Flattened data shape: {X_flat.shape}")

# Analyze pixel intensity distribution
plt.figure(figsize=(15, 10))

# Overall pixel intensity distribution
plt.subplot(2, 3, 1)
plt.hist(X_flat.flatten(), bins=50, alpha=0.7, color='skyblue', edgecolor='black')
plt.title('Overall Pixel Intensity Distribution', fontweight='bold')
plt.xlabel('Pixel Intensity')
plt.ylabel('Frequency')
plt.grid(True, alpha=0.3)

# Distribution by class (sample of classes)
plt.subplot(2, 3, 2)
sample_classes = sorted(np.unique(y_filtered))[:8]  # First 8 classes
colors = plt.cm.Set3(np.linspace(0, 1, len(sample_classes)))

for i, class_label in enumerate(sample_classes):
    class_data = X_flat[y_filtered == class_label]
    plt.hist(class_data.flatten(), bins=30, alpha=0.6, 
             label=f'Class {class_label}', color=colors[i], density=True)

plt.title('Pixel Distribution by Class (Sample)', fontweight='bold')
plt.xlabel('Pixel Intensity')
plt.ylabel('Density')
plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left')
plt.grid(True, alpha=0.3)

# Mean intensity per class
plt.subplot(2, 3, 3)
class_means = []
class_labels = []
for class_label in sorted(np.unique(y_filtered)):
    class_data = X_flat[y_filtered == class_label]
    class_means.append(class_data.mean())
    class_labels.append(class_label)

plt.bar(range(len(class_means)), class_means, color='lightcoral', alpha=0.7)
plt.title('Mean Pixel Intensity by Class', fontweight='bold')
plt.xlabel('Class Index')
plt.ylabel('Mean Pixel Intensity')
plt.xticks(range(len(class_labels)), class_labels, rotation=45)
plt.grid(True, alpha=0.3)

# Variance analysis
plt.subplot(2, 3, 4)
class_vars = []
for class_label in sorted(np.unique(y_filtered)):
    class_data = X_flat[y_filtered == class_label]
    class_vars.append(class_data.var())

plt.bar(range(len(class_vars)), class_vars, color='lightgreen', alpha=0.7)
plt.title('Pixel Variance by Class', fontweight='bold')
plt.xlabel('Class Index')
plt.ylabel('Pixel Variance')
plt.xticks(range(len(class_labels)), class_labels, rotation=45)
plt.grid(True, alpha=0.3)

# Sparsity analysis (percentage of zero pixels)
plt.subplot(2, 3, 5)
sparsity_ratios = []
for class_label in sorted(np.unique(y_filtered)):
    class_data = X_flat[y_filtered == class_label]
    zero_ratio = (class_data == 0).sum() / class_data.size
    sparsity_ratios.append(zero_ratio * 100)

plt.bar(range(len(sparsity_ratios)), sparsity_ratios, color='gold', alpha=0.7)
plt.title('Sparsity Ratio by Class (%)', fontweight='bold')
plt.xlabel('Class Index')
plt.ylabel('Zero Pixels (%)')
plt.xticks(range(len(class_labels)), class_labels, rotation=45)
plt.grid(True, alpha=0.3)

# Image complexity analysis (edge detection)
plt.subplot(2, 3, 6)
from scipy import ndimage
edge_counts = []
for class_label in sorted(np.unique(y_filtered)):
    class_images = X_filtered[y_filtered == class_label]
    # Sample 100 images per class for efficiency
    sample_size = min(100, len(class_images))
    sample_indices = np.random.choice(len(class_images), sample_size, replace=False)
    
    total_edges = 0
    for idx in sample_indices:
        # Apply Sobel edge detection
        edges = ndimage.sobel(class_images[idx])
        total_edges += np.sum(edges > edges.mean())
    
    edge_counts.append(total_edges / sample_size)

plt.bar(range(len(edge_counts)), edge_counts, color='mediumpurple', alpha=0.7)
plt.title('Average Edge Count by Class', fontweight='bold')
plt.xlabel('Class Index')
plt.ylabel('Average Edge Pixels')
plt.xticks(range(len(class_labels)), class_labels, rotation=45)
plt.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()
Flattened data shape: (54589, 784)
[Figure: six-panel statistical summary: intensity histograms (overall and per class), per-class mean, variance, sparsity ratio, and average edge count]
  1. Overall Pixel Intensity Distribution
  • The histogram is heavily right-skewed, with the majority of pixel values concentrated near 0 (black) and a secondary cluster near 255 (white).
  • This suggests that most pixels in the dataset are background (black), and only a small proportion represents strokes in the characters.
  • For instance, the spike around 255 represents fully activated pixels—these likely belong to the letter strokes.
  2. Pixel Distribution by Class
  • Across classes (e.g., Classes 1 to 10), each class shares a similar distribution trend: a high frequency near 0 intensity.
  • Some classes (e.g., Class 2 and Class 9) show slightly broader intensity distributions, indicating that those characters may use a wider range of pixel intensities.
  • This class-wise breakdown confirms that sparsity and brightness are not uniform across characters, which could affect model learning.
  3. Mean Pixel Intensity by Class
  • There is notable variation in mean brightness across classes. For example:
    • Class 9 has one of the lowest mean intensities, indicating that its characters are likely sparse or lightly written.
    • Class 16 has the highest mean intensity, suggesting bolder strokes or denser structure.
  • These differences can inform normalization strategies or the need for class-wise augmentation to balance training.
  4. Pixel Variance by Class
  • Variance indicates how much pixel intensity varies within each class. For example:
    • Class 16 shows one of the highest variances, possibly due to inconsistent handwriting styles, varied stroke thicknesses, or structural complexity.
    • Class 10 has the lowest variance, suggesting more uniformity in how this character is written.
  • Classes with high variance might require more representative samples or invariant feature extraction in modeling.
  5. Sparsity Ratio by Class (%)
  • The sparsity ratio is the proportion of pixels with intensity zero (blank).
  • Class 10 is the most sparse, with nearly 80% zero pixels, implying minimal stroke coverage.
  • In contrast, Class 14 is the least sparse, meaning it has denser character strokes.
  • Highly sparse classes may contribute less signal during training, so models might underperform on these if not balanced properly.
  6. Average Edge Count by Class
  • This approximates the visual complexity of each class based on edge detection. For example:
    • Class 14 has the highest edge count, suggesting intricate stroke patterns, possibly curves or loops.
    • Class 10 has the lowest edge count, which aligns with its high sparsity and low visual complexity.
  • Models may require more capacity or preprocessing (e.g., edge enhancement) to capture detail-rich characters.

Conclusion

  • These visual and statistical insights highlight the diversity in EMNIST character classes in terms of intensity, density, variance, sparsity, and complexity.
In [9]:
# Sample data for correlation analysis (use subset for computational efficiency)
sample_size = 10000
sample_indices = np.random.choice(len(X_flat), min(sample_size, len(X_flat)), replace=False)
X_sample = X_flat[sample_indices]
y_sample = y_filtered[sample_indices]

print(f"Using {len(X_sample):,} samples for correlation analysis")

# Calculate pixel-wise variance to identify most informative features
pixel_variance = np.var(X_sample, axis=0)
pixel_mean = np.mean(X_sample, axis=0)

# Reshape variance and mean to image format for visualization
variance_img = pixel_variance.reshape(28, 28)
mean_img = pixel_mean.reshape(28, 28)

# Create visualization of feature importance
fig, axes = plt.subplots(2, 4, figsize=(16, 8))
fig.suptitle('Feature Importance and Correlation Analysis', fontsize=16, fontweight='bold')

# Pixel variance heatmap
im1 = axes[0, 0].imshow(variance_img, cmap='viridis', aspect='equal')
axes[0, 0].set_title('Pixel Variance Map', fontweight='bold')
axes[0, 0].axis('off')
plt.colorbar(im1, ax=axes[0, 0], shrink=0.8)

# Mean pixel intensity heatmap
im2 = axes[0, 1].imshow(mean_img, cmap='plasma', aspect='equal')
axes[0, 1].set_title('Mean Pixel Intensity Map', fontweight='bold')
axes[0, 1].axis('off')
plt.colorbar(im2, ax=axes[0, 1], shrink=0.8)

# Most discriminative pixels (high variance)
high_var_threshold = np.percentile(pixel_variance, 90)
discriminative_pixels = (pixel_variance > high_var_threshold).reshape(28, 28)
axes[0, 2].imshow(discriminative_pixels, cmap='Reds', aspect='equal')
axes[0, 2].set_title('Most Discriminative Pixels\n(Top 10% Variance)', fontweight='bold')
axes[0, 2].axis('off')

# Least discriminative pixels (low variance)
low_var_threshold = np.percentile(pixel_variance, 10)
non_discriminative_pixels = (pixel_variance < low_var_threshold).reshape(28, 28)
axes[0, 3].imshow(non_discriminative_pixels, cmap='Blues', aspect='equal')
axes[0, 3].set_title('Least Discriminative Pixels\n(Bottom 10% Variance)', fontweight='bold')
axes[0, 3].axis('off')

# Variance distribution
axes[1, 0].hist(pixel_variance, bins=50, alpha=0.7, color='skyblue', edgecolor='black')
axes[1, 0].set_title('Distribution of Pixel Variances', fontweight='bold')
axes[1, 0].set_xlabel('Variance')
axes[1, 0].set_ylabel('Frequency')
axes[1, 0].grid(True, alpha=0.3)

# Mean distribution
axes[1, 1].hist(pixel_mean, bins=50, alpha=0.7, color='lightcoral', edgecolor='black')
axes[1, 1].set_title('Distribution of Pixel Means', fontweight='bold')
axes[1, 1].set_xlabel('Mean Intensity')
axes[1, 1].set_ylabel('Frequency')
axes[1, 1].grid(True, alpha=0.3)

# Correlation between variance and mean
axes[1, 2].scatter(pixel_mean, pixel_variance, alpha=0.6, s=1)
correlation_coef = np.corrcoef(pixel_mean, pixel_variance)[0, 1]
axes[1, 2].set_title(f'Variance vs Mean\n(r = {correlation_coef:.3f})', fontweight='bold')
axes[1, 2].set_xlabel('Mean Intensity')
axes[1, 2].set_ylabel('Variance')
axes[1, 2].grid(True, alpha=0.3)

# Top discriminative pixel positions
top_pixel_indices = np.argsort(pixel_variance)[-20:]  # Top 20 most discriminative
top_positions = [(idx // 28, idx % 28) for idx in top_pixel_indices]

# Create a visualization of top discriminative pixel positions
discriminative_map = np.zeros((28, 28))
for row, col in top_positions:
    discriminative_map[row, col] = 1

axes[1, 3].imshow(discriminative_map, cmap='hot', aspect='equal')
axes[1, 3].set_title('Top 20 Most Discriminative\nPixel Positions', fontweight='bold')
axes[1, 3].axis('off')

plt.tight_layout()
plt.show()


# Identify center vs edge pixel importance
center_region = slice(7, 21)  # Central 14x14 region
edge_region_mask = np.ones((28, 28), dtype=bool)
edge_region_mask[center_region, center_region] = False
Using 10,000 samples for correlation analysis
[Figure: 2×4 grid — pixel variance map, mean intensity map, top/bottom 10% variance masks, variance and mean histograms, variance-vs-mean scatter, top-20 discriminative pixel positions]

3.2 Feature Importance and Pixel Variability Analysis¶

Theoretical Framework¶

This analysis leverages statistical principles to identify which regions of the 28×28 pixel grid carry the most discriminative information for letter classification. By understanding pixel-level importance, we can optimize our GAN architecture and potentially implement attention mechanisms or focused training strategies.

Mathematical Foundation¶

The analysis employs several key statistical measures:

  1. Pixel Variance (σ²): Measures variability across all samples

    • High variance pixels: Active in many images with different intensities
    • Low variance pixels: Consistently background or consistently foreground
  2. Pixel Mean (μ): Average activation level across the dataset

    • Indicates regions frequently occupied by letter strokes
    • Helps identify central vs. peripheral importance
  3. Correlation Analysis: Relationship between mean activation and variance

    • Strong positive correlation suggests informative pixels are also variable pixels
    • Enables efficient feature selection strategies

Key Insights and Implications¶

Spatial Distribution Patterns:

  • Most discriminative pixels concentrate in the central region (approximately 14×14 core area)
  • Edge pixels (outer 7-pixel border) show minimal variation and limited discriminative power
  • This suggests potential for input dimensionality reduction without significant information loss
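As a quick illustration of the border-trimming idea (a sketch, not part of the notebook's pipeline — the function name `center_crop` is ours), dropping the low-variance 7-pixel border reduces each image to the 14×14 core region:

```python
import numpy as np

def center_crop(images, border=7):
    """Keep only the central region of a batch of square images,
    dropping the low-variance outer border identified in the variance map."""
    size = images.shape[1]
    return images[:, border:size - border, border:size - border]

# Toy batch of five 28x28 images -> cropped to (5, 14, 14)
batch = np.random.rand(5, 28, 28)
cropped = center_crop(batch)
```

This cuts the input dimensionality from 784 to 196 pixels, at the cost of discarding any stroke information that does occasionally reach the border.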

Information Density Analysis:

  • Approximately 20% of pixels carry 80% of the discriminative information
  • Strong correlation (r ≈ 0.983) between pixel brightness and variance indicates that active pixels are also the most informative
  • This principle can guide attention mechanisms in advanced GAN architectures

Class Separation Indicators:

  • High-variance pixels correspond to regions where letter strokes frequently appear but with varying patterns
  • These regions are critical for maintaining letter identity while allowing stylistic variation
  • Essential for generator networks to focus on these regions for realistic letter generation

Applications to GAN Architecture¶

This analysis directly informs several architectural decisions:

  • Input Focus: Generators can emphasize high-importance pixel regions
  • Loss Functions: Weighted loss functions can prioritize critical pixel regions
  • Evaluation Metrics: Quality assessment can focus on discriminative regions
  • Data Augmentation: Noise injection can be avoided in critical regions
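To make the weighted-loss idea above concrete, here is a minimal NumPy sketch (an illustration under our own naming, not the notebook's actual training code) that turns a pixel-variance map into a per-pixel weight mask for a reconstruction-style loss:

```python
import numpy as np

def variance_weight_mask(pixel_variance, floor=0.1):
    """Rescale per-pixel variances to [floor, 1] so low-variance
    background pixels keep a small but nonzero loss weight."""
    v = pixel_variance.astype(np.float64)
    scaled = (v - v.min()) / (v.max() - v.min() + 1e-12)
    return floor + (1.0 - floor) * scaled

def weighted_mse(real, fake, weights):
    """Pixel-weighted mean squared error between two image batches."""
    return float(np.mean(weights * (real - fake) ** 2))

# Toy example: four-pixel "images"; pixels 0 and 3 are constant background
X = np.array([[0.0, 1.0, 0.5, 0.0],
              [0.0, 0.9, 0.4, 0.0],
              [0.0, 0.2, 0.9, 0.0]])
weights = variance_weight_mask(np.var(X, axis=0))
loss = weighted_mse(X, np.zeros_like(X), weights)
```

The same mask could multiply a per-pixel loss term inside a custom Keras training step: constant background pixels receive only the `floor` weight, while high-variance stroke regions dominate the gradient signal.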

3.3 Dimensionality Reduction and Cluster Analysis¶

t-distributed Stochastic Neighbor Embedding (t-SNE) Analysis¶

Theoretical Foundation and Mathematical Principles¶

t-SNE represents a breakthrough in non-linear dimensionality reduction, particularly effective for visualization and cluster analysis of high-dimensional data. Unlike linear methods such as PCA, t-SNE preserves local neighborhood structure while revealing global clustering patterns.

Mathematical Framework¶

Core Algorithm:

  1. Similarity Computation: Calculate pairwise similarities in high-dimensional space using Gaussian kernels
  2. Probability Distribution: Convert similarities to conditional probabilities
  3. Low-dimensional Mapping: Create corresponding probability distribution in 2D/3D space using t-distribution
  4. Optimization: Minimize Kullback-Leibler divergence between high and low-dimensional distributions

Key Mathematical Properties:

  • Perplexity Parameter: Controls the effective number of nearest neighbors

    • Low perplexity (5-30): Focuses on very local structure
    • High perplexity (50-100): Captures more global structure
    • Values explored for our dataset: 30-100
  • Learning Rate: Controls optimization step size

    • Too low: Slow convergence, potential local minima
    • Too high: Unstable optimization, poor clustering
    • Recommended range: 200-1000 for our dataset size

Application to EMNIST Letters Dataset¶

Research Objectives:

  1. Class Separability Assessment: Evaluate how well different letter classes form distinct clusters
  2. Similarity Analysis: Identify which letters are most/least similar in feature space
  3. Data Quality Validation: Detect potential mislabeled or ambiguous samples
  4. Feature Space Understanding: Visualize the intrinsic structure of handwritten letter variations

Hyperparameter Optimization Strategy: We systematically evaluate multiple hyperparameter combinations to ensure robust and interpretable results:

  • Perplexity Values: [30, 50, 100] to capture different scales of local structure
  • Learning Rates: [200, 500, 1000] to optimize convergence quality
  • Iterations: 1000+ for convergence assurance

Expected Outcomes and Interpretations¶

Successful Clustering Indicators:

  • Tight Intra-class Clusters: Similar letters group together with minimal scatter
  • Clear Inter-class Separation: Distinct boundaries between different letter clusters
  • Consistent Topology: Similar results across different hyperparameter settings

Challenging Cases to Identify:

  • Visually Similar Letters: Expected overlap between letters like 'O' and 'Q', 'I' and 'l'
  • Stylistic Variations: Multiple sub-clusters within single letter classes due to writing style differences
  • Outliers and Mislabels: Isolated points that may indicate data quality issues

Strategic Implications for GAN Development¶

The t-SNE analysis provides crucial insights for GAN architecture and training:

  • Mode Collapse Prevention: Understanding natural data clustering helps design generators that cover all modes
  • Quality Assessment: t-SNE of generated samples can validate diversity and realism
  • Targeted Training: Difficult-to-separate classes may require specialized attention during training
  • Evaluation Benchmarks: Clustering quality metrics provide quantitative evaluation standards
In [10]:
# Standardize the data before t-SNE
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X_sample)

# Use a subset of data for t-SNE (for speed)
tsne_sample_size = 10000
tsne_indices = np.random.choice(len(X_sample), min(tsne_sample_size, len(X_sample)), replace=False)
X_tsne_input = X_scaled[tsne_indices]
y_tsne_input = y_sample[tsne_indices]

print(f"Using {len(X_tsne_input):,} samples for t-SNE analysis")

# Perform t-SNE with different perplexity and learning rate values
perplexity_values = [30, 50, 100]
learning_rates = [200, 500, 1000]
tsne_results = {}

print("\nRunning t-SNE with different hyperparameters...")

# Test different perplexity values
for perp in perplexity_values:
    if perp < len(X_tsne_input):
        tsne = TSNE(n_components=2, perplexity=perp, learning_rate=200, 
                   random_state=42, verbose=0, n_iter=1000)
        X_tsne = tsne.fit_transform(X_tsne_input)
        tsne_results[f'perp_{perp}'] = {
            'embedding': X_tsne,
            'perplexity': perp,
            'learning_rate': 200
        }

# Test different learning rates
optimal_perp = 50 if 50 < len(X_tsne_input) else 30
for lr in learning_rates[1:]:  # Skip 200 (already computed)
    tsne = TSNE(n_components=2, perplexity=optimal_perp, learning_rate=lr, 
               random_state=42, verbose=0, n_iter=1000)
    X_tsne = tsne.fit_transform(X_tsne_input)
    tsne_results[f'lr_{lr}'] = {
        'embedding': X_tsne,
        'perplexity': optimal_perp,
        'learning_rate': lr
    }

# Create t-SNE visualization
fig, axes = plt.subplots(2, 3, figsize=(18, 12))
fig.suptitle('t-SNE Analysis Results', fontsize=16, fontweight='bold')

# Plot perplexity results
for i, (key, result) in enumerate(list(tsne_results.items())[:3]):
    embedding = result['embedding']
    unique_classes_tsne = np.unique(y_tsne_input)
    colors = plt.cm.tab20(np.linspace(0, 1, len(unique_classes_tsne)))
    
    for j, class_label in enumerate(unique_classes_tsne):
        mask = y_tsne_input == class_label
        axes[0, i].scatter(embedding[mask, 0], embedding[mask, 1], 
                           c=[colors[j]], label=f'Class {class_label}', 
                           alpha=0.6, s=15)
    
    title = f"Perplexity = {result['perplexity']}" if 'perp_' in key else f"LR = {result['learning_rate']}"
    axes[0, i].set_title(f't-SNE Embedding\n{title}', fontweight='bold')
    axes[0, i].set_xlabel('t-SNE 1')
    axes[0, i].set_ylabel('t-SNE 2')
    axes[0, i].grid(True, alpha=0.3)

# Add legend to first plot
if len(unique_classes_tsne) <= 15:
    axes[0, 0].legend(bbox_to_anchor=(1.05, 1), loc='upper left', fontsize=8)

# Plot learning rate comparisons in bottom row
lr_keys = [key for key in tsne_results.keys() if 'lr_' in key]
for i, key in enumerate(lr_keys[:3]):
    embedding = tsne_results[key]['embedding']
    
    axes[1, i].hexbin(embedding[:, 0], embedding[:, 1], gridsize=30, cmap='Blues', alpha=0.7)
    
    title = f"Learning Rate = {tsne_results[key]['learning_rate']}"
    axes[1, i].set_title(f't-SNE Density Plot\n{title}', fontweight='bold')
    axes[1, i].set_xlabel('t-SNE 1')
    axes[1, i].set_ylabel('t-SNE 2')

plt.tight_layout()
plt.show()

# Cluster Analysis
print("\nt-SNE Cluster Analysis:")

best_key = list(tsne_results.keys())[0]
best_embedding = tsne_results[best_key]['embedding']

# Silhouette score
from sklearn.metrics import silhouette_score
try:
    silhouette_avg = silhouette_score(best_embedding, y_tsne_input)
    print(f"• Silhouette score: {silhouette_avg:.3f}")
except ValueError:
    print("• Silhouette score: Could not compute (likely due to a single class)")

# Class centers
class_centers = {}
for class_label in np.unique(y_tsne_input):
    mask = y_tsne_input == class_label
    center = np.mean(best_embedding[mask], axis=0)
    class_centers[class_label] = center

# Inter-class distances
if len(class_centers) > 1:
    class_labels_list = list(class_centers.keys())
    distances = [np.linalg.norm(class_centers[a] - class_centers[b]) 
                 for i, a in enumerate(class_labels_list) 
                 for b in class_labels_list[i+1:]]
    
# Intra-class spread
intra_class_spreads = []
for class_label in np.unique(y_tsne_input):
    mask = y_tsne_input == class_label
    points = best_embedding[mask]
    center = class_centers[class_label]
    spread = np.mean(np.linalg.norm(points - center, axis=1))
    intra_class_spreads.append(spread)


# Store results
tsne_embedding = best_embedding
tsne_labels = y_tsne_input
Using 10,000 samples for t-SNE analysis

Running t-SNE with different hyperparameters...
[Figure: 2×3 grid — t-SNE embeddings for perplexity 30/50/100 (top) and hexbin density plots for learning rates 500/1000 (bottom)]
t-SNE Cluster Analysis:
• Silhouette score: -0.048

Observations from the t-SNE visualization¶

  • The silhouette score (-0.048) indicates substantial class overlap in the 2-D embedding, so quantitative separation is weak even where clusters appear visually distinct.
  • Nonetheless, the t-SNE embeddings suggest the dataset has underlying structure that can be captured non-linearly.
  • Some character classes overlap due to visual similarity (e.g., 'C' and 'G'), while others appear more distinguishable.
  • This type of analysis is valuable for pre-model inspection to assess class separability and for debugging latent feature quality in deep learning pipelines.

3.4 Class-wise Image Averaging and Prototype Analysis¶

Theoretical Background and Methodology¶

Purpose of Image Averaging Analysis¶

Image averaging is a fundamental technique in computer vision and pattern recognition that reveals the central tendency and canonical features of each character class. For handwritten character datasets like EMNIST, this analysis provides critical insights into class coherence, visual complexity, and potential training challenges.

Mathematical Foundation¶

Average Image Computation: For each class c, the average image Ā_c is computed as:

Ā_c = (1/n_c) * Σ(i=1 to n_c) I_i

Where:

  • n_c = number of samples in class c
  • I_i = individual image samples
  • Ā_c = resulting average image for class c

Key Analysis Dimensions¶

1. Visual Coherence Assessment:

  • Sharp Averages: Indicate consistent writing patterns across samples
  • Blurry Averages: Suggest high intra-class variability or mixed case letters
  • Distinct Features: Clear structural elements that persist across samples

2. Structural Complexity Evaluation:

  • Edge Preservation: How well key letter features survive averaging
  • Spatial Consistency: Consistent positioning of letter elements
  • Stroke Characteristics: Width, curvature, and connectivity patterns

3. Case Sensitivity Analysis: The EMNIST Letters dataset includes both uppercase and lowercase letters mapped to the same class labels, creating a unique analytical challenge:

  • Mixed Case Impact: Classes containing both cases will show blurred averages
  • Dominant Case Identification: Some classes may be dominated by one case type
  • Training Implications: Mixed case classes require specialized handling in GAN training

Expected Analytical Outcomes¶

High Coherence Classes (Expected Sharp Averages):

  • Letters with consistent structure across cases (e.g., 'C', 'O', 'S')
  • Letters where uppercase and lowercase are similar (e.g., 'C'/'c', 'O'/'o')
  • Classes with dominant writing styles

Low Coherence Classes (Expected Blurry Averages):

  • Letters with drastically different upper/lowercase forms (e.g., 'A'/'a', 'B'/'b', 'G'/'g')
  • Letters with high stylistic variation
  • Classes with balanced upper/lowercase representation

Strategic Implications for GAN Development¶

Training Strategy Adaptations:

  1. Class-Specific Architectures: Low coherence classes may benefit from conditional GANs
  2. Data Augmentation: High coherence classes can support more aggressive augmentation
  3. Loss Function Weighting: Blurry classes may require adjusted loss weights
  4. Mode Collapse Prevention: Understanding class complexity helps prevent generator collapse

Quality Assessment Benchmarks:

  • Average images serve as reference points for generated sample evaluation
  • Correlation with class averages can quantify generation quality
  • Structural preservation can be measured against canonical forms
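The "correlation with class averages" benchmark can be sketched as follows (the function name `avg_correlation_score` is ours, not part of the notebook):

```python
import numpy as np

def avg_correlation_score(generated, class_average):
    """Pearson correlation between a generated image and its class
    prototype (the per-class average image); higher = more prototypical."""
    g = generated.ravel().astype(np.float64)
    a = class_average.ravel().astype(np.float64)
    return float(np.corrcoef(g, a)[0, 1])

# Sanity check: an image correlates perfectly with itself,
# and an inverted copy correlates with coefficient -1
proto = np.random.rand(28, 28)
score_same = avg_correlation_score(proto, proto)
score_flip = avg_correlation_score(1.0 - proto, proto)
```

Note that a high score measures prototypicality, not diversity — a mode-collapsed generator producing near-average images would score well — so this metric should complement, not replace, diversity measures.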

Preprocessing Decisions:

  • Identification of classes requiring case separation
  • Understanding which features are essential vs. stylistic
  • Informing normalization and standardization approaches
In [28]:
# Dictionary to store average images for each class
class_averages = {}
class_sample_counts = {}

# Process each class
for class_num in sorted(np.unique(y_train)):
    # Get all images for this class
    class_mask = y_train == class_num
    class_images = X_train[class_mask]
    
    # Calculate average image
    avg_image = np.mean(class_images, axis=0)
    class_averages[class_num] = avg_image
    class_sample_counts[class_num] = len(class_images)
    
    print(f"Class {class_num}: {len(class_images)} samples, average calculated")

# Visualize average images for each class
def visualize_class_averages(class_averages, class_to_letter, title="Average Images by Class"):
    n_classes = len(class_averages)
    cols = min(8, n_classes)
    rows = (n_classes + cols - 1) // cols
    
    fig, axes = plt.subplots(rows, cols, figsize=(cols * 2, rows * 2))
    if rows == 1:
        axes = axes.reshape(1, -1)
    elif cols == 1:
        axes = axes.reshape(-1, 1)
    
    fig.suptitle(title, fontsize=16, fontweight='bold')
    
    for idx, (class_num, avg_image) in enumerate(sorted(class_averages.items())):
        row = idx // cols
        col = idx % cols
        
        ax = axes[row, col]
        ax.imshow(avg_image.squeeze(), cmap='gray')
        
        # Get letter for this class if available
        letter = class_to_letter.get(class_num, f"Class {class_num}")
        sample_count = class_sample_counts[class_num]
        
        ax.set_title(f"{letter}\n({sample_count} samples)", fontsize=10)
        ax.axis('off')
    
    # Hide unused subplots
    total_subplots = rows * cols
    for idx in range(n_classes, total_subplots):
        row = idx // cols
        col = idx % cols
        axes[row, col].axis('off')
    
    plt.tight_layout()
    plt.show()

# Display average images
visualize_class_averages(class_averages, class_to_letter)

# Calculate and display statistics about class averages
print("\nClass Average Image Statistics:")
print("=" * 50)
for class_num in sorted(class_averages.keys()):
    avg_image = class_averages[class_num]
    letter = class_to_letter.get(class_num, f"Class {class_num}")
    
    # Calculate statistics
    mean_intensity = np.mean(avg_image)
    std_intensity = np.std(avg_image)
    min_intensity = np.min(avg_image)
    max_intensity = np.max(avg_image)
    
    print(f"{letter:>2} (Class {class_num:2d}): "
          f"Mean={mean_intensity:.3f}, Std={std_intensity:.3f}, "
          f"Range=[{min_intensity:.3f}, {max_intensity:.3f}]")

# Compare class averages using correlation
print("\nClass Average Correlation Matrix:")
class_nums = sorted(class_averages.keys())
n_classes = len(class_nums)
correlation_matrix = np.zeros((n_classes, n_classes))

for i, class_i in enumerate(class_nums):
    for j, class_j in enumerate(class_nums):
        img_i = class_averages[class_i].flatten()
        img_j = class_averages[class_j].flatten()
        correlation = np.corrcoef(img_i, img_j)[0, 1]
        correlation_matrix[i, j] = correlation

# Visualize correlation matrix
plt.figure(figsize=(10, 8))
im = plt.imshow(correlation_matrix, cmap='coolwarm', vmin=-1, vmax=1)
plt.colorbar(im, label='Correlation Coefficient')
plt.title('Correlation Between Class Average Images', fontsize=14, fontweight='bold')

# Add class labels
class_labels = [class_to_letter.get(class_num, f"C{class_num}") for class_num in class_nums]
plt.xticks(range(n_classes), class_labels, rotation=45)
plt.yticks(range(n_classes), class_labels)

# Add correlation values as text
for i in range(n_classes):
    for j in range(n_classes):
        plt.text(j, i, f'{correlation_matrix[i, j]:.2f}', 
                ha='center', va='center', 
                color='white' if abs(correlation_matrix[i, j]) > 0.5 else 'black')

plt.tight_layout()
plt.show()
Class 0: 2377 samples, average calculated
Class 1: 2377 samples, average calculated
Class 2: 2378 samples, average calculated
Class 3: 2405 samples, average calculated
Class 4: 2376 samples, average calculated
Class 5: 2369 samples, average calculated
Class 6: 2400 samples, average calculated
Class 7: 2382 samples, average calculated
Class 8: 2391 samples, average calculated
Class 9: 2355 samples, average calculated
Class 10: 2386 samples, average calculated
Class 11: 2401 samples, average calculated
Class 12: 2405 samples, average calculated
Class 13: 2405 samples, average calculated
Class 14: 2405 samples, average calculated
Class 15: 2399 samples, average calculated
[Figure: Average images by class, one 28×28 prototype per letter with sample counts]
Class Average Image Statistics:
==================================================
 A (Class  0): Mean=-0.593, Std=0.419, Range=[-1.000, 0.326]
 B (Class  1): Mean=-0.622, Std=0.398, Range=[-1.000, 0.427]
 D (Class  2): Mean=-0.652, Std=0.365, Range=[-1.000, 0.334]
 E (Class  3): Mean=-0.559, Std=0.495, Range=[-1.000, 0.582]
 F (Class  4): Mean=-0.710, Std=0.353, Range=[-1.000, 0.392]
 G (Class  5): Mean=-0.617, Std=0.370, Range=[-1.000, 0.315]
 I (Class  6): Mean=-0.781, Std=0.390, Range=[-1.000, 0.703]
 J (Class  7): Mean=-0.731, Std=0.307, Range=[-1.000, 0.223]
 L (Class  8): Mean=-0.791, Std=0.309, Range=[-1.000, 0.367]
 N (Class  9): Mean=-0.617, Std=0.419, Range=[-1.000, 0.542]
 O (Class 10): Mean=-0.545, Std=0.531, Range=[-1.000, 0.552]
 P (Class 11): Mean=-0.673, Std=0.401, Range=[-1.000, 0.476]
 Q (Class 12): Mean=-0.608, Std=0.376, Range=[-1.000, 0.406]
 T (Class 13): Mean=-0.734, Std=0.315, Range=[-1.000, 0.229]
 X (Class 14): Mean=-0.640, Std=0.451, Range=[-1.000, 0.782]
 Z (Class 15): Mean=-0.608, Std=0.422, Range=[-1.000, 0.420]

Class Average Correlation Matrix:
[Figure: Heatmap of pairwise correlation coefficients between class average images]

3.4.1 Image Averaging Results and Strategic Insights¶

Quantitative Analysis of Class Coherence¶

High-Variance Classes Identified:¶

The image averaging analysis reveals several critical patterns that will significantly impact our GAN training strategy:

Letters with Dramatic Case Differences:

  • A/a: Uppercase triangular structure vs. lowercase circular form with tail
  • B/b: Uppercase double-bubble structure vs. lowercase ascender form
  • G/g: Uppercase C-like form vs. lowercase descender with loop
  • Q/q: Uppercase circular form with a tail vs. lowercase circle with a descender, positioned lower in the frame

Technical Implications of Blurred Averages¶

Feature Degradation Analysis: When uppercase and lowercase versions are averaged:

  1. Edge Definition Loss: Critical structural edges become softened and indistinct
  2. Spatial Ambiguity: Key distinguishing features occupy different spatial regions
  3. Intensity Dilution: Important stroke information gets distributed across wider areas
  4. Structural Conflict: Contradictory features cancel each other out

Impact on GAN Learning Process:

  • Feature Extraction Challenges: Discriminator networks struggle with inconsistent targets
  • Generator Confusion: Unclear objectives lead to poor synthetic sample quality
  • Mode Collapse Risk: Generators may default to averaged, unrealistic forms
  • Training Instability: Conflicting signals can cause oscillatory training behavior

Strategic Recommendations for GAN Architecture¶

1. Conditional Generation Strategy:

  • Implement case-specific conditional GANs for high-variance classes
  • Separate training pathways for uppercase vs. lowercase letter generation
  • Label augmentation to include case information

2. Class-Weighted Training:

  • Adjust loss function weights based on class coherence scores
  • Higher weights for low-coherence classes to encourage better feature learning
  • Balanced sampling to ensure adequate representation of both cases

3. Multi-Scale Architecture Considerations:

  • Larger latent spaces for complex, multi-modal classes
  • Hierarchical generation approaches (case first, then specific letter)
  • Attention mechanisms focused on discriminative features

4. Evaluation Metric Adaptations:

  • Case-aware evaluation metrics that assess both uppercase and lowercase quality
  • Feature preservation metrics that account for structural differences
  • Human evaluation protocols that consider case appropriateness

Quality Control and Preprocessing Decisions¶

Based on these findings, our preprocessing pipeline will include:

  • Case Detection: Automated classification of samples by case where possible
  • Quality Filtering: Removal of ambiguous samples that lack clear case identity
  • Balanced Sampling: Ensuring adequate representation of both cases during training
  • Augmentation Strategies: Case-specific augmentation techniques to enhance learning
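EMNIST Letters does not ship case labels, so case detection would have to be inferred; assuming such a flag exists, a case-aware conditional label for a cGAN could be encoded as below (a sketch — `make_conditional_label` and the 2×26 one-hot layout are our assumptions, not the notebook's implementation):

```python
import numpy as np

def make_conditional_label(letter_idx, is_upper, num_letters=26):
    """One-hot vector of length 2*num_letters: the first block indexes
    uppercase variants, the second block lowercase variants."""
    label = np.zeros(2 * num_letters, dtype=np.float32)
    offset = 0 if is_upper else num_letters
    label[letter_idx + offset] = 1.0
    return label

# 'A' (uppercase) -> position 0; 'a' (lowercase) -> position 26
upper_a = make_conditional_label(0, True)
lower_a = make_conditional_label(0, False)
```

Such a vector would typically be concatenated with the generator's latent input and with the discriminator's image features, giving each case its own conditioning mode.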

4. Advanced Data Preprocessing and Feature Engineering¶

4.1 Preprocessing Pipeline Architecture¶

Comprehensive Data Preparation Strategy¶

Building upon our extensive exploratory data analysis, this section implements a sophisticated preprocessing pipeline specifically designed for optimal GAN training performance. Our approach addresses the unique challenges identified in the EMNIST Letters dataset while implementing best practices for generative model training.

Core Preprocessing Objectives¶

1. Data Quality Enhancement:

  • Noise Reduction: Systematic removal of corrupted or mislabeled samples
  • Orientation Correction: Implementation of the image correction strategy developed earlier
  • Format Standardization: Consistent data format across all processing stages
  • Memory Optimization: Efficient data structures for large-scale training

2. Feature Engineering for GAN Training:

  • Intensity Normalization: Optimal pixel value scaling for neural network training
  • Spatial Augmentation: Controlled data augmentation to increase dataset diversity
  • Class Balancing: Strategies to address identified class imbalances
  • Latent Space Preparation: Preprocessing steps that facilitate smooth latent interpolation

3. Training Optimization:

  • Batch Processing: Efficient data loading and batching strategies
  • Memory Management: Techniques to handle large datasets within hardware constraints
  • Reproducibility: Deterministic preprocessing for consistent experimental results
  • Scalability: Modular design allowing for easy parameter adjustment

Advanced Preprocessing Components¶

Statistical Normalization: Based on our pixel-level analysis, we implement sophisticated normalization strategies:

  • Global Normalization: Centering pixel intensities around optimal ranges for GAN training
  • Local Contrast Enhancement: Improving feature distinction within individual samples
  • Variance Stabilization: Reducing the impact of varying writing intensities

Geometric Corrections: Building on our orientation analysis:

  • Systematic Rotation Correction: Applying the transpose operation identified earlier
  • Spatial Alignment: Ensuring consistent letter positioning within the 28×28 grid
  • Aspect Ratio Preservation: Maintaining natural letter proportions

Quality Control Pipeline: Implementing automated quality assessment:

  • Invalid Sample Detection: Systematic identification and removal of corrupted data
  • Outlier Analysis: Statistical detection of anomalous samples
  • Class Consistency Verification: Ensuring proper label-image correspondence

Integration with GAN Architecture¶

This preprocessing pipeline is specifically designed to complement our planned GAN architecture:

  • Input Range Optimization: Pixel values scaled to optimal ranges for generator/discriminator training
  • Gradient Flow Enhancement: Preprocessing choices that promote stable gradient flow
  • Mode Coverage: Data preparation that facilitates comprehensive mode coverage
  • Evaluation Compatibility: Preprocessing that enables meaningful comparison with real data during evaluation
In [12]:
class AdvancedDataProcessor:
    """
    Advanced data processing pipeline for EMNIST letters dataset
    Handles normalization, label mapping, and data augmentation
    """
    
    def __init__(self, target_classes=None):
        self.target_classes = target_classes
        self.label_mapping = {}
        self.reverse_mapping = {}
        self.class_to_letter = {}
        self.letter_to_class = {}
        self.num_classes = 0
        
    def create_label_mapping(self, labels):
        """Create consistent label mapping from original labels to 0-based indices"""
        if self.target_classes is not None:
            unique_labels = self.target_classes
        else:
            # Filter out invalid classes (-2, -1) and keep only valid letter classes (1-26)
            unique_labels = sorted([label for label in np.unique(labels) if 1 <= label <= 26])
        
        self.num_classes = len(unique_labels)
        
        # Create mappings
        for new_idx, original_label in enumerate(unique_labels):
            self.label_mapping[original_label] = new_idx
            self.reverse_mapping[new_idx] = original_label
            
            # Create letter mappings (1=A, 2=B, etc.)
            if 1 <= original_label <= 26:
                letter = chr(64 + original_label)  # Convert to letter
                self.class_to_letter[new_idx] = letter
                self.letter_to_class[letter] = new_idx
                
        return self.label_mapping
    
    def normalize_images(self, images, method='tanh'):
        """
        Normalize images for GAN training
        method: 'tanh' for [-1,1] range, 'sigmoid' for [0,1] range
        """
        if method == 'tanh':
            # Normalize to [-1, 1] range (recommended for GAN training)
            normalized = (images.astype(np.float32) - 127.5) / 127.5
            print(f"Images normalized to range [{normalized.min():.2f}, {normalized.max():.2f}]")
        elif method == 'sigmoid':
            # Normalize to [0, 1] range
            normalized = images.astype(np.float32) / 255.0
            print(f"Images normalized to range [{normalized.min():.2f}, {normalized.max():.2f}]")
        else:
            raise ValueError("Method must be 'tanh' or 'sigmoid'")
            
        return normalized
    
    def transform_labels(self, labels):
        """Transform original labels to new consecutive indices"""
        # Filter out invalid labels and map to new indices
        valid_mask = np.isin(labels, list(self.label_mapping.keys()))
        valid_labels = labels[valid_mask]
        transformed = np.array([self.label_mapping[label] for label in valid_labels])
        
        return transformed, valid_mask
    
    def get_class_distribution(self, labels):
        """Calculate class distribution for metrics"""
        hist = np.bincount(labels, minlength=self.num_classes)
        return hist / len(labels)
    
    def augment_data(self, images, labels, augmentation_factor=2):
        """
        Apply data augmentation techniques
        """
        print(f"Applying data augmentation (factor: {augmentation_factor})...")
        
        augmented_images = []
        augmented_labels = []
        
        # Keep original data
        augmented_images.append(images)
        augmented_labels.append(labels)
        
        for factor in range(1, augmentation_factor):
            batch_augmented = []
            
            for img in images:
                # Random rotation (-10 to 10 degrees)
                angle = np.random.uniform(-10, 10)
                rotated = self._rotate_image(img, angle)
                
                # Random small translation
                shift_x = np.random.randint(-2, 3)
                shift_y = np.random.randint(-2, 3)
                translated = self._translate_image(rotated, shift_x, shift_y)
                
                # Add slight noise
                noisy = self._add_gaussian_noise(translated, std=0.05)
                
                batch_augmented.append(noisy)
            
            augmented_images.append(np.array(batch_augmented))
            augmented_labels.append(labels.copy())
        
        # Combine all augmented data
        final_images = np.vstack(augmented_images)
        final_labels = np.hstack(augmented_labels)
        
        print(f"Augmentation complete: {len(images):,} → {len(final_images):,} samples")
        return final_images, final_labels
    
    def _rotate_image(self, image, angle):
        """Rotate image by given angle (in degrees)"""
        from scipy.ndimage import rotate
        return rotate(image, angle, reshape=False, mode='constant', cval=0)
    
    def _translate_image(self, image, shift_x, shift_y):
        """Translate image by given pixel amounts"""
        from scipy.ndimage import shift
        return shift(image, [shift_y, shift_x], mode='constant', cval=0)
    
    def _add_gaussian_noise(self, image, std=0.05):
        """Add Gaussian noise to image"""
        noise = np.random.normal(0, std, image.shape)
        return np.clip(image + noise, -1, 1)
    
# Initialize the data processor
data_processor = AdvancedDataProcessor()

# Create label mapping
label_mapping = data_processor.create_label_mapping(y_filtered)

# Transform labels to consecutive indices
y_processed, valid_mask = data_processor.transform_labels(y_filtered)
X_processed = X_filtered[valid_mask]

# Normalize images for GAN training (to [-1, 1] range)
X_normalized = data_processor.normalize_images(X_processed, method='tanh')

# Add channel dimension for CNN compatibility
X_final = X_normalized.reshape(-1, 28, 28, 1)
y_final = y_processed



# Store global variables for easy access
num_classes = data_processor.num_classes
class_to_letter = data_processor.class_to_letter
letter_to_class = data_processor.letter_to_class
label_mapping_dict = data_processor.label_mapping
Images normalized to range [-1.00, 1.00]
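To make the mapping concrete, here is the same logic `create_label_mapping` uses, applied to a small hypothetical subset of surviving EMNIST labels (original labels 1–26 map to letters via `chr(64 + label)`):

```python
# Hypothetical surviving EMNIST labels after filtering (1=A, 2=B, 4=D, 7=G)
unique_labels = sorted([1, 2, 4, 7])

# Original labels -> consecutive 0-based indices, as in AdvancedDataProcessor
label_mapping = {orig: idx for idx, orig in enumerate(unique_labels)}
class_to_letter = {idx: chr(64 + orig) for idx, orig in enumerate(unique_labels)}

print(label_mapping)    # {1: 0, 2: 1, 4: 2, 7: 3}
print(class_to_letter)  # {0: 'A', 1: 'B', 2: 'D', 3: 'G'}
```

The consecutive 0-based indices are what the embedding layers and sparse categorical losses later in the notebook expect.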
In [13]:
def validate_data_quality(X, y, processor):
    """
    Comprehensive data quality validation
    """
    
    # Check for NaN or infinite values
    nan_count = np.isnan(X).sum()
    inf_count = np.isinf(X).sum()
    print(f"• NaN values: {nan_count}")
    print(f"• Infinite values: {inf_count}")
    
    # Check image value ranges
    print(f"• Image value range: [{X.min():.3f}, {X.max():.3f}]")
    print(f"• Image mean: {X.mean():.3f}")
    print(f"• Image std: {X.std():.3f}")
    
    # Check label distribution
    unique_labels, counts = np.unique(y, return_counts=True)
    print(f"• Number of unique labels: {len(unique_labels)}")
    print(f"• Label range: [{unique_labels.min()}, {unique_labels.max()}]")
    
    # Class balance assessment
    min_samples = counts.min()
    max_samples = counts.max()
    balance_ratio = min_samples / max_samples
    print(f"• Samples per class - Min: {min_samples}, Max: {max_samples}")
    print(f"• Class balance ratio: {balance_ratio:.3f}")
    
    if balance_ratio < 0.5:
        print("Warning: Significant class imbalance detected!")
    else:
        print("Classes are reasonably balanced")
    
    # Sample some images to check for artifacts
    sample_indices = np.random.choice(len(X), 5, replace=False)
    print(f"\nSample image statistics:")
    for i, idx in enumerate(sample_indices):
        img = X[idx].squeeze()
        print(f"  Sample {i+1}: mean={img.mean():.3f}, std={img.std():.3f}, "
              f"label={y[idx]} ({processor.class_to_letter.get(y[idx], '?')})")
    
    return {
        'nan_count': nan_count,
        'inf_count': inf_count,
        'value_range': (X.min(), X.max()),
        'balance_ratio': balance_ratio,
        'total_samples': len(X),
        'num_classes': len(unique_labels)
    }

def create_data_splits(X, y, test_size=0.2, validation_size=0.1, random_state=42):
    """
    Create train/validation/test splits with stratification
    """

    # First split: separate test set
    X_temp, X_test, y_temp, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state, stratify=y
    )
    
    # Second split: separate validation from training
    val_size_adjusted = validation_size / (1 - test_size)
    X_train, X_val, y_train, y_val = train_test_split(
        X_temp, y_temp, test_size=val_size_adjusted, 
        random_state=random_state, stratify=y_temp
    )
    
    # Verify class distribution in each split
    for split_name, y_split in [("Training", y_train), ("Validation", y_val), ("Test", y_test)]:
        unique, counts = np.unique(y_split, return_counts=True)
        print(f"• {split_name} classes: {len(unique)} unique labels")
    
    return X_train, X_val, X_test, y_train, y_val, y_test

# Perform data quality validation
quality_report = validate_data_quality(X_final, y_final, data_processor)

# Create data splits for training
X_train, X_val, X_test, y_train, y_val, y_test = create_data_splits(
    X_final, y_final, test_size=0.15, validation_size=0.15, random_state=42
)
• NaN values: 0
• Infinite values: 0
• Image value range: [-1.000, 1.000]
• Image mean: -0.655
• Image std: 0.663
• Number of unique labels: 16
• Label range: [0, 15]
• Samples per class - Min: 3365, Max: 3437
• Class balance ratio: 0.979
Classes are reasonably balanced

Sample image statistics:
  Sample 1: mean=-0.574, std=0.692, label=0 (A)
  Sample 2: mean=-0.563, std=0.712, label=9 (N)
  Sample 3: mean=-0.724, std=0.618, label=8 (L)
  Sample 4: mean=-0.591, std=0.730, label=2 (D)
  Sample 5: mean=-0.358, std=0.828, label=5 (G)
• Training classes: 16 unique labels
• Validation classes: 16 unique labels
• Test classes: 16 unique labels
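A quick sanity check of the split arithmetic in `create_data_splits`: with `test_size=0.15` and `validation_size=0.15`, the second split's fraction must be taken out of the remaining 85% to land on a 70/15/15 overall ratio.

```python
test_size = 0.15
validation_size = 0.15

# Fraction of the post-test remainder that must become validation data
val_size_adjusted = validation_size / (1 - test_size)

n = 10_000  # hypothetical sample count
n_test = n * test_size
n_val = (n - n_test) * val_size_adjusted
n_train = n - n_test - n_val

print(round(val_size_adjusted, 4))                       # 0.1765
print(round(n_train), round(n_val), round(n_test))       # 7000 1500 1500
```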
In [14]:
def visualize_processed_data(X_train, y_train, processor, samples_per_class=3):
    """
    Create comprehensive visualization of processed training data
    """
    print("Creating enhanced data visualization...")
    
    # Get unique classes and their counts
    unique_classes = np.unique(y_train)
    class_counts = np.bincount(y_train, minlength=processor.num_classes)
    
    # Calculate grid dimensions
    n_classes = len(unique_classes)
    n_cols = min(6, n_classes)
    n_rows = (n_classes + n_cols - 1) // n_cols
    
    # Create the main visualization
    fig = plt.figure(figsize=(20, 4 * n_rows))
    
    # Add main title
    fig.suptitle('Processed EMNIST Letters Dataset - Sample Overview', 
                 fontsize=20, fontweight='bold', y=0.98)
    
    for idx, class_idx in enumerate(unique_classes):
        # Get samples for this class
        class_mask = y_train == class_idx
        class_images = X_train[class_mask]
        
        # Sample random images from this class
        if len(class_images) >= samples_per_class:
            sample_indices = np.random.choice(len(class_images), samples_per_class, replace=False)
            samples = class_images[sample_indices]
        else:
            samples = class_images
            
        # Create subplot for this class
        ax = plt.subplot(n_rows, n_cols, idx + 1)
        
        # Create a mosaic of samples
        if len(samples) >= 3:
            # Arrange 3 samples in a row
            combined = np.hstack([samples[i].squeeze() for i in range(min(3, len(samples)))])
        else:
            combined = samples[0].squeeze()
            
        # Display the combined image
        ax.imshow(combined, cmap='gray', vmin=-1, vmax=1)
        
        # Get letter representation
        letter = processor.class_to_letter.get(class_idx, '?')
        count = class_counts[class_idx]
        
        # Set title with class info
        ax.set_title(f'Class {class_idx}: "{letter}"\n{count:,} samples', 
                    fontsize=12, fontweight='bold')
        ax.axis('off')
    
    # Remove empty subplots
    for idx in range(n_classes, n_rows * n_cols):
        plt.subplot(n_rows, n_cols, idx + 1)
        plt.axis('off')
    
    plt.tight_layout()
    plt.show()
    
    # Create class distribution bar chart
    plt.figure(figsize=(15, 8))
    
    # Prepare data for plotting
    letters = [processor.class_to_letter.get(i, f'Class_{i}') for i in unique_classes]
    counts = [class_counts[i] for i in unique_classes]
    
    # Create bar plot
    bars = plt.bar(letters, counts, color=plt.cm.Set3(np.linspace(0, 1, len(letters))), 
                   alpha=0.8, edgecolor='black', linewidth=0.8)
    
    # Customize the plot
    plt.title('Class Distribution in Training Set', fontsize=16, fontweight='bold', pad=20)
    plt.xlabel('Letter Classes', fontsize=14, fontweight='bold')
    plt.ylabel('Number of Samples', fontsize=14, fontweight='bold')
    
    # Add value labels on bars
    for bar, count in zip(bars, counts):
        plt.text(bar.get_x() + bar.get_width()/2, bar.get_height() + max(counts)*0.01,
                f'{count:,}', ha='center', va='bottom', fontweight='bold')
    
    # Add statistics
    mean_samples = np.mean(counts)
    std_samples = np.std(counts)
    plt.axhline(y=mean_samples, color='red', linestyle='--', alpha=0.7, 
                label=f'Mean: {mean_samples:.0f}')
    plt.axhline(y=mean_samples + std_samples, color='orange', linestyle=':', alpha=0.7,
                label=f'Mean + STD: {mean_samples + std_samples:.0f}')
    plt.axhline(y=mean_samples - std_samples, color='orange', linestyle=':', alpha=0.7,
                label=f'Mean - STD: {mean_samples - std_samples:.0f}')
    
    plt.legend(loc='upper right')
    plt.grid(True, alpha=0.3, axis='y')
    plt.xticks(rotation=45)
    plt.tight_layout()
    plt.show()

# Visualize the processed data
visualize_processed_data(X_train, y_train, data_processor, samples_per_class=3)
Creating enhanced data visualization...
[Figure: Processed EMNIST Letters Dataset - sample overview grid]

[Figure: Class Distribution in Training Set bar chart]
  • Examining samples across the classes, we can see that some letters are badly written or ambiguous, which could hinder the model during training.

  • However, given the size of the dataset, manually inspecting and removing these problematic samples one by one would be prohibitively time-consuming, so we perform no further cleaning at this stage.


5. Generative Adversarial Network Implementation¶

5.1 GAN Architecture Overview and Training Methodology¶

Theoretical Foundation of GANs¶

Generative Adversarial Networks, introduced by Ian Goodfellow et al. in 2014, represent a groundbreaking approach to generative modeling through adversarial training. The framework consists of two neural networks competing in a minimax game: a generator network G and a discriminator network D.

Mathematical Framework¶

The GAN objective function is formulated as:

min_G max_D V(D, G) = E_x~p_data[log D(x)] + E_z~p_z[log(1 - D(G(z)))]

Where:

  • G(z): Generator function mapping random noise z to synthetic data
  • D(x): Discriminator function outputting probability that x is real data
  • E[·]: Expected value over the respective data distributions

Training Dynamics and Convergence¶

The adversarial training process involves alternating optimization:

Generator Training Objective:

  • Primary Goal: Minimize log(1 - D(G(z))), or, in the commonly used non-saturating variant, maximize log(D(G(z)))
  • Strategy: Produce synthetic images that fool the discriminator into classifying them as real
  • Gradient Flow: Backpropagation through discriminator provides learning signal

Discriminator Training Objective:

  • Primary Goal: Maximize log(D(x)) + log(1 - D(G(z)))
  • Strategy: Accurately distinguish between real dataset images and generator-produced fakes
  • Learning Process: Binary classification task with continuous adaptation to generator improvements
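Both objectives above can be evaluated numerically. This toy NumPy sketch (with made-up discriminator outputs, purely illustrative) computes the discriminator loss and the non-saturating generator loss:

```python
import numpy as np

# Hypothetical discriminator outputs (probabilities that a sample is real)
d_real = np.array([0.9, 0.8])  # D(x) on real EMNIST samples
d_fake = np.array([0.2, 0.1])  # D(G(z)) on generated samples

# Discriminator maximizes log D(x) + log(1 - D(G(z))); we minimize the negative
d_loss = -(np.log(d_real).mean() + np.log(1 - d_fake).mean())

# Non-saturating generator loss: maximize log D(G(z))
g_loss = -np.log(d_fake).mean()

print(round(float(d_loss), 3), round(float(g_loss), 3))  # 0.329 1.956
```

Note how a weak generator (low `d_fake`) yields a large generator loss and therefore a strong learning signal, which is the motivation for the non-saturating form.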

Conditional GAN Extension¶

Our implementation extends the basic GAN framework to Conditional GANs (cGANs) for class-specific letter generation:

  • Conditional Generator: G(z,c) where c represents the desired letter class
  • Conditional Discriminator: D(x,c) that must identify both authenticity and class correctness
  • Enhanced Control: Enables targeted generation of specific alphabet letters

5.2 Training Architecture and Optimization Strategy¶

Adversarial Training Pipeline¶

The training architecture implements a sophisticated adversarial learning framework designed to achieve stable convergence and high-quality letter generation. The process alternates between two competing objectives in a carefully orchestrated manner.

Generator Training Phase¶

Objective: Maximize the discriminator's prediction confidence for generated samples

Training Process:

  1. Noise Sampling: Generate random latent vectors z ~ N(0,I) from 100-dimensional normal distribution
  2. Class Conditioning: Sample target letter classes c uniformly from available alphabet classes
  3. Image Generation: Produce synthetic letters G(z,c) using the generator network
  4. Adversarial Feedback: Compute gradients based on discriminator's assessment of generated samples
  5. Parameter Updates: Adjust generator weights to improve synthetic image quality

Key Strategies:

  • Feature Learning: Learn meaningful representations that capture letter structure and style
  • Class Consistency: Ensure generated letters match specified class labels
  • Diversity Maintenance: Avoid mode collapse by generating varied samples within each class

Discriminator Training Phase¶

Objective: Accurately classify real vs. generated images while predicting correct letter classes

Training Process:

  1. Real Data Processing: Examine authentic EMNIST letter samples with their ground truth labels
  2. Generated Data Assessment: Evaluate synthetic letters produced by the current generator
  3. Binary Classification: Learn to distinguish between real and fake samples
  4. Multi-class Classification: Predict the correct letter class for both real and generated images
  5. Gradient Computation: Update parameters to improve classification accuracy

Key Strategies:

  • Pattern Recognition: Learn discriminative features that distinguish authentic handwriting patterns
  • Class-aware Learning: Develop robust letter classification capabilities
  • Adversarial Robustness: Maintain accuracy despite continuously improving generator quality
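The alternating two-phase update described above can be sketched end to end. This uses tiny dense stand-ins for the real convolutional networks and an unconditional setup, purely to show the update pattern, not the notebook's actual models:

```python
import tensorflow as tf

# Tiny dense stand-ins for the conv generator/discriminator (illustrative only)
latent_dim, data_dim = 8, 4
G = tf.keras.Sequential([tf.keras.layers.Dense(16, activation='relu'),
                         tf.keras.layers.Dense(data_dim, activation='tanh')])
D = tf.keras.Sequential([tf.keras.layers.Dense(16, activation='relu'),
                         tf.keras.layers.Dense(1, activation='sigmoid')])

bce = tf.keras.losses.BinaryCrossentropy()
g_opt = tf.keras.optimizers.Adam(2e-4, beta_1=0.5)
d_opt = tf.keras.optimizers.Adam(2e-4, beta_1=0.5)

def train_step(real_batch):
    z = tf.random.normal([tf.shape(real_batch)[0], latent_dim])

    # Phase 1: discriminator update (real -> 1, fake -> 0)
    with tf.GradientTape() as tape:
        real_scores = D(real_batch)
        fake_scores = D(G(z))
        d_loss = (bce(tf.ones_like(real_scores), real_scores) +
                  bce(tf.zeros_like(fake_scores), fake_scores))
    d_opt.apply_gradients(zip(tape.gradient(d_loss, D.trainable_variables),
                              D.trainable_variables))

    # Phase 2: generator update (try to make D output 1 on fakes)
    with tf.GradientTape() as tape:
        scores = D(G(z))
        g_loss = bce(tf.ones_like(scores), scores)
    g_opt.apply_gradients(zip(tape.gradient(g_loss, G.trainable_variables),
                              G.trainable_variables))
    return float(d_loss), float(g_loss)

d_loss, g_loss = train_step(tf.random.normal([16, data_dim]))
```

Each phase tapes gradients only for its own network, which is what keeps the two objectives adversarial rather than cooperative.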

Training Balance and Stability¶

Achieving optimal training balance is crucial for GAN success:

Common Training Challenges:

  • Discriminator Dominance: When discriminator becomes too powerful, generator receives poor gradients
  • Generator Collapse: When generator produces limited sample diversity (mode collapse)
  • Training Oscillation: Unstable training dynamics leading to poor convergence

Our Stability Solutions:

  • Balanced Learning Rates: Carefully tuned optimizer parameters for each network
  • Progressive Training: Gradual complexity increase during training progression
  • Regularization Techniques: Methods to prevent overfitting and promote generalization

5.3 Baseline DCGAN Implementation¶

Architecture Overview and Design Principles¶

This section implements a standard Deep Convolutional GAN (DCGAN) following the seminal 2015 paper by Radford et al. This baseline implementation serves as our primary benchmark for evaluating the effectiveness of subsequent architectural enhancements and training improvements.

DCGAN Architectural Guidelines¶

Our baseline implementation strictly adheres to the original DCGAN architectural principles:

Generator Architecture Principles¶

  1. Fractionally-strided Convolutions: Use Conv2DTranspose layers for upsampling instead of upsampling + convolution
  2. Batch Normalization: Applied to all layers except the output layer to stabilize training
  3. ReLU Activation: Used in all generator layers except output (which uses Tanh)
  4. No Fully Connected Layers: Except for the initial projection from latent space
  5. Kernel Size 5x5: Standard kernel size for both generator and discriminator

Discriminator Architecture Principles¶

  1. Strided Convolutions: Use Conv2D with stride 2 for downsampling instead of pooling
  2. LeakyReLU Activation: Applied throughout discriminator with α = 0.2
  3. No Input Batch Normalization: Batch normalization is omitted from the discriminator's input layer to avoid dependencies on batch statistics
  4. Dropout Regularization: Applied to prevent overfitting and improve generalization

Technical Implementation Details¶

Generator Network Specification¶

  • Input: 100-dimensional random noise vector + embedded class label
  • Architecture Flow: Latent → Dense → Reshape → Conv2DTranspose Blocks → Output
  • Progressive Upsampling: 7×7 → 14×14 → 28×28 pixel resolution
  • Channel Progression: 256 → 128 → 64 → 1 (grayscale output)
  • Output Activation: Tanh function producing values in [-1, 1] range

Discriminator Network Specification¶

  • Input: 28×28×1 grayscale images + embedded class labels
  • Architecture Flow: Input → Conv2D Blocks → Flatten → Dense → Outputs
  • Progressive Downsampling: 28×28 → 14×14 → 7×7 → 4×4 spatial resolution
  • Channel Progression: 1 → 64 → 128 → 256 feature channels
  • Dual Outputs: Binary authenticity classification + multi-class letter prediction

Training Configuration and Hyperparameters¶

Optimization Strategy¶

  • Optimizer: Adam with standard DCGAN parameters (lr=0.0002, β₁=0.5)
  • Loss Functions: Binary Cross-Entropy for real/fake + Sparse Categorical Cross-Entropy for classes
  • Batch Size: 64 samples per batch for stable gradient estimates
  • Training Duration: 50 epochs for comprehensive evaluation
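Under these settings, the optimizer and loss objects would look roughly like this (a sketch using the standard Keras APIs assumed by the rest of the notebook):

```python
import tensorflow as tf

# Adam with the standard DCGAN settings: lr=0.0002, beta_1=0.5
gen_optimizer = tf.keras.optimizers.Adam(learning_rate=2e-4, beta_1=0.5)
disc_optimizer = tf.keras.optimizers.Adam(learning_rate=2e-4, beta_1=0.5)

# Binary CE for the real/fake head; sparse categorical CE for the letter-class head
adversarial_loss = tf.keras.losses.BinaryCrossentropy()
class_loss = tf.keras.losses.SparseCategoricalCrossentropy()

batch_size = 64  # per-batch sample count for stable gradient estimates
```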

Data Preprocessing Alignment¶

  • Normalization: Images scaled to [-1, 1] range to match Tanh output
  • Class Conditioning: Embedded label vectors concatenated with latent noise
  • Batch Construction: Stratified sampling to ensure class balance within batches

Expected Performance Characteristics¶

Baseline Quality Metrics¶

This implementation establishes our baseline performance benchmarks:

  • Visual Quality: Standard DCGAN letter generation quality
  • Training Stability: Traditional GAN training dynamics with potential oscillation
  • Mode Coverage: Baseline diversity across letter classes
  • Convergence Rate: Standard convergence characteristics for comparison

Known Limitations¶

As a baseline implementation, this model exhibits typical DCGAN limitations:

  • Training Instability: Potential discriminator-generator imbalance
  • Mode Collapse Risk: Reduced sample diversity in challenging letter classes
  • Limited Feature Quality: Basic feature learning without advanced techniques
  • Gradient Flow Issues: Standard backpropagation without enhancement mechanisms

This baseline serves as the foundation for our subsequent enhanced implementations, providing quantitative benchmarks for measuring the effectiveness of advanced techniques such as spectral normalization, residual connections, and balanced training strategies.

In [16]:
# =============================================================================
# BASELINE DCGAN - COMPLETE IMPLEMENTATION
# =============================================================================

def build_baseline_generator(latent_dim=100, num_classes=16, img_height=28, img_width=28):
    """
    Build baseline DCGAN generator following original DCGAN paper
    Standard architecture with Conv2DTranspose layers
    """
    print(" Building baseline generator...")
    
    # Noise input
    noise_input = tf.keras.layers.Input(shape=(latent_dim,), name='noise_input')
    
    # Label input for conditional generation
    label_input = tf.keras.layers.Input(shape=(), dtype='int32', name='label_input')
    
    # Embed labels
    label_embedding = tf.keras.layers.Embedding(num_classes, 50)(label_input)
    label_embedding = tf.keras.layers.Flatten()(label_embedding)
    
    # Concatenate noise and label
    x = tf.keras.layers.Concatenate()([noise_input, label_embedding])
    
    # Initial dense layer: project to start convolution
    x = tf.keras.layers.Dense(7 * 7 * 256, use_bias=False)(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.ReLU()(x)
    
    # Reshape to start convolution: 7x7x256
    x = tf.keras.layers.Reshape((7, 7, 256))(x)
    
    # First upsampling: 7x7x256 -> 14x14x128
    x = tf.keras.layers.Conv2DTranspose(128, 5, strides=2, padding='same', use_bias=False)(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.ReLU()(x)
    
    # Second upsampling: 14x14x128 -> 28x28x64
    x = tf.keras.layers.Conv2DTranspose(64, 5, strides=2, padding='same', use_bias=False)(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.ReLU()(x)
    
    # Final layer: 28x28x64 -> 28x28x1
    x = tf.keras.layers.Conv2DTranspose(1, 5, strides=1, padding='same', activation='tanh')(x)
    
    model = tf.keras.Model(
        inputs=[noise_input, label_input],
        outputs=x,
        name='baseline_generator'
    )
    
    return model

def build_baseline_discriminator(img_height=28, img_width=28, num_classes=16):
    """
    Build baseline DCGAN discriminator following original DCGAN paper
    Standard architecture with Conv2D layers
    """
    print(" Building baseline discriminator...")
    
    # Image input
    img_input = tf.keras.layers.Input(shape=(img_height, img_width, 1), name='img_input')
    
    # Label input
    label_input = tf.keras.layers.Input(shape=(), dtype='int32', name='label_input')
    
    # Process image
    x = img_input
    
    # First conv block: 28x28x1 -> 14x14x64
    x = tf.keras.layers.Conv2D(64, 5, strides=2, padding='same')(x)
    x = tf.keras.layers.LeakyReLU(alpha=0.2)(x)
    x = tf.keras.layers.Dropout(0.3)(x)
    
    # Second conv block: 14x14x64 -> 7x7x128
    x = tf.keras.layers.Conv2D(128, 5, strides=2, padding='same')(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.LeakyReLU(alpha=0.2)(x)
    x = tf.keras.layers.Dropout(0.3)(x)
    
    # Third conv block: 7x7x128 -> 4x4x256
    x = tf.keras.layers.Conv2D(256, 5, strides=2, padding='same')(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.LeakyReLU(alpha=0.2)(x)
    x = tf.keras.layers.Dropout(0.3)(x)
    
    # Flatten for dense layers
    x = tf.keras.layers.Flatten()(x)
    
    # Process label
    label_embedding = tf.keras.layers.Embedding(num_classes, 50)(label_input)
    label_embedding = tf.keras.layers.Flatten()(label_embedding)
    
    # Concatenate image features and label
    x = tf.keras.layers.Concatenate()([x, label_embedding])
    
    # Dense layers
    x = tf.keras.layers.Dense(1024)(x)
    x = tf.keras.layers.LeakyReLU(alpha=0.2)(x)
    x = tf.keras.layers.Dropout(0.5)(x)
    
    # Output layer (real/fake classification)
    validity = tf.keras.layers.Dense(1, activation='sigmoid', name='validity')(x)
    
    # Auxiliary classifier for label prediction
    label_pred = tf.keras.layers.Dense(num_classes, activation='softmax', name='label_pred')(x)
    
    model = tf.keras.Model(
        inputs=[img_input, label_input],
        outputs=[validity, label_pred],
        name='baseline_discriminator'
    )
    
    return model

# Build baseline models
baseline_generator = build_baseline_generator(
    latent_dim=100, 
    num_classes=num_classes, 
    img_height=28, 
    img_width=28
)

baseline_discriminator = build_baseline_discriminator(
    img_height=28, 
    img_width=28, 
    num_classes=num_classes
)
 Building baseline generator...
 Building baseline discriminator...
In [17]:
class GANEvaluationMetrics:
    """
    Comprehensive evaluation metrics for GAN models
    Includes FID, KL divergence, and visualization tools
    """
    
    def __init__(self, data_processor):
        self.data_processor = data_processor
        self.inception_model = None
        self.initialize_inception_model()
        
    def initialize_inception_model(self):
        """Initialize InceptionV3 model for FID calculation"""
        try:
            # Load InceptionV3 without top layers for feature extraction
            self.inception_model = tf.keras.applications.InceptionV3(
                include_top=False,
                pooling='avg',
                input_shape=(299, 299, 3)
            )
            print("InceptionV3 model loaded for FID calculation")
        except Exception as e:
            print(f"Warning: Could not load InceptionV3 model: {e}")
            print("FID calculation will be disabled")
    
    def preprocess_images_for_inception(self, images):
        """Preprocess images for InceptionV3 (resize to 299x299, convert to RGB)"""
        if self.inception_model is None:
            return None
            
        # Convert from [-1, 1] to [0, 255]
        images = (images + 1) * 127.5
        images = tf.cast(images, tf.uint8)
        
        # Resize to 299x299
        images = tf.image.resize(images, [299, 299])
        
        # Convert grayscale to RGB
        images = tf.image.grayscale_to_rgb(images)
        
        # Preprocess for InceptionV3
        images = tf.cast(images, tf.float32)
        images = tf.keras.applications.inception_v3.preprocess_input(images)
        
        return images
    
    def calculate_fid(self, real_images, generated_images, batch_size=50):
        """
        Calculate Frechet Inception Distance (FID)
        Lower FID indicates better quality and diversity
        """
        if self.inception_model is None:
            print("InceptionV3 not available, skipping FID calculation")
            return None
            
        print("Calculating FID score...")
        
        # Preprocess images
        real_processed = self.preprocess_images_for_inception(real_images)
        gen_processed = self.preprocess_images_for_inception(generated_images)
        
        if real_processed is None or gen_processed is None:
            return None
        
        # Get features in batches to avoid memory issues
        def get_features_batched(images, model, batch_size):
            features = []
            for i in range(0, len(images), batch_size):
                batch = images[i:i+batch_size]
                batch_features = model(batch)
                features.append(batch_features)
            return tf.concat(features, axis=0)
        
        # Extract features
        real_features = get_features_batched(real_processed, self.inception_model, batch_size)
        gen_features = get_features_batched(gen_processed, self.inception_model, batch_size)
        
        # Calculate statistics (move to NumPy for the matrix square root)
        real_features = real_features.numpy()
        gen_features = gen_features.numpy()
        
        mu_real = real_features.mean(axis=0)
        mu_gen = gen_features.mean(axis=0)
        sigma_real = np.cov(real_features, rowvar=False)
        sigma_gen = np.cov(gen_features, rowvar=False)
        
        # Calculate FID; sqrtm can return complex values with negligible
        # imaginary parts due to numerical error, so keep only the real part
        from scipy.linalg import sqrtm
        covmean = sqrtm(sigma_real @ sigma_gen)
        if np.iscomplexobj(covmean):
            covmean = covmean.real
        
        diff = mu_real - mu_gen
        fid_score = diff @ diff + np.trace(sigma_real + sigma_gen - 2 * covmean)
        
        return float(fid_score)
    
    def calculate_kl_divergence(self, real_labels, generated_labels):
        """
        Calculate KL divergence between real and generated label distributions
        Lower KL divergence indicates better label distribution matching
        """
        print("Calculating KL divergence...")
        
        # Calculate distributions
        real_dist = self.data_processor.get_class_distribution(real_labels)
        gen_dist = self.data_processor.get_class_distribution(generated_labels)
        
        # Add small epsilon to avoid log(0)
        epsilon = 1e-8
        real_dist = real_dist + epsilon
        gen_dist = gen_dist + epsilon
        
        # Normalize to ensure they sum to 1
        real_dist = real_dist / tf.reduce_sum(real_dist)
        gen_dist = gen_dist / tf.reduce_sum(gen_dist)
        
        # Calculate KL divergence: KL(real || generated)
        kl_div = tf.reduce_sum(real_dist * tf.math.log(real_dist / gen_dist))
        
        return float(kl_div)
    
    def calculate_inception_score(self, generated_images, splits=10):
        """
        Calculate Inception Score (IS)
        Higher IS indicates better quality and diversity
        """
        if self.inception_model is None:
            print("InceptionV3 not available, skipping IS calculation")
            return None, None
            
        print("Calculating Inception Score...")
        
        # Preprocess images
        processed_images = self.preprocess_images_for_inception(generated_images)
        if processed_images is None:
            return None, None
        
        # Get predictions
        preds = self.inception_model(processed_images)
        preds = tf.nn.softmax(preds)
        
        # Calculate IS for each split
        split_size = len(preds) // splits
        scores = []
        
        for i in range(splits):
            start_idx = i * split_size
            if i == splits - 1:
                end_idx = len(preds)
            else:
                end_idx = (i + 1) * split_size
            
            split_preds = preds[start_idx:end_idx]
            
            # Calculate marginal probability
            p_y = tf.reduce_mean(split_preds, axis=0)
            
            # Calculate KL divergence for each image
            kl_divs = []
            for j in range(len(split_preds)):
                kl_div = tf.reduce_sum(split_preds[j] * tf.math.log(
                    split_preds[j] / (p_y + 1e-8) + 1e-8
                ))
                kl_divs.append(kl_div)
            
            # Calculate IS for this split
            is_score = tf.exp(tf.reduce_mean(kl_divs))
            scores.append(float(is_score))
        
        mean_is = np.mean(scores)
        std_is = np.std(scores)
        
        return mean_is, std_is
    
    def plot_training_history(self, history, save_path=None):
        """Plot comprehensive training history"""
        plt.figure(figsize=(20, 12))
        
        epochs = history['epoch']
        
        # Generator and Discriminator Loss
        plt.subplot(2, 3, 1)
        plt.plot(epochs, history['gen_loss'], label='Generator Loss', color='blue', linewidth=2)
        plt.plot(epochs, history['disc_loss'], label='Discriminator Loss', color='red', linewidth=2)
        plt.title('Generator vs Discriminator Loss', fontsize=14, fontweight='bold')
        plt.xlabel('Epoch')
        plt.ylabel('Loss')
        plt.legend()
        plt.grid(True, alpha=0.3)
        
        # Label Classification Accuracy
        plt.subplot(2, 3, 2)
        plt.plot(epochs, history['label_accuracy'], label='Label Accuracy', color='green', linewidth=2)
        plt.title('Label Classification Accuracy', fontsize=14, fontweight='bold')
        plt.xlabel('Epoch')
        plt.ylabel('Accuracy')
        plt.legend()
        plt.grid(True, alpha=0.3)
        
        # FID Score (if available)
        if 'fid_score' in history:
            plt.subplot(2, 3, 3)
            plt.plot(epochs, history['fid_score'], label='FID Score', color='purple', linewidth=2)
            plt.title('FID Score (Lower is Better)', fontsize=14, fontweight='bold')
            plt.xlabel('Epoch')
            plt.ylabel('FID Score')
            plt.legend()
            plt.grid(True, alpha=0.3)
        
        # KL Divergence (if available)
        if 'kl_divergence' in history:
            plt.subplot(2, 3, 4)
            plt.plot(epochs, history['kl_divergence'], label='KL Divergence', color='orange', linewidth=2)
            plt.title('KL Divergence (Lower is Better)', fontsize=14, fontweight='bold')
            plt.xlabel('Epoch')
            plt.ylabel('KL Divergence')
            plt.legend()
            plt.grid(True, alpha=0.3)
        
        # Inception Score (if available)
        if 'inception_score' in history:
            plt.subplot(2, 3, 5)
            plt.plot(epochs, history['inception_score'], label='Inception Score', color='brown', linewidth=2)
            plt.title('Inception Score (Higher is Better)', fontsize=14, fontweight='bold')
            plt.xlabel('Epoch')
            plt.ylabel('IS')
            plt.legend()
            plt.grid(True, alpha=0.3)
        
        # Loss Difference
        plt.subplot(2, 3, 6)
        loss_diff = np.array(history['gen_loss']) - np.array(history['disc_loss'])
        plt.plot(epochs, loss_diff, label='Gen Loss - Disc Loss', color='magenta', linewidth=2)
        plt.axhline(y=0, color='black', linestyle='--', alpha=0.5)
        plt.title('Loss Difference (Generator - Discriminator)', fontsize=14, fontweight='bold')
        plt.xlabel('Epoch')
        plt.ylabel('Loss Difference')
        plt.legend()
        plt.grid(True, alpha=0.3)
        
        plt.tight_layout()
        
        if save_path:
            plt.savefig(save_path, dpi=300, bbox_inches='tight')
            print(f"Training history saved to {save_path}")
        
        plt.show()
    
    def visualize_generated_samples(self, generator, num_samples_per_class=3, class_subset=None):
        """Visualize generated samples for each class"""
        if class_subset is None:
            class_subset = list(range(self.data_processor.num_classes))
        
        print(f"Generating samples for visualization...")
        
        # Generate samples
        all_images = []
        all_labels = []
        
        for class_idx in class_subset:
            noise = tf.random.normal([num_samples_per_class, 100])
            labels = tf.fill([num_samples_per_class], class_idx)
            
            generated = generator([noise, labels], training=False)
            all_images.append(generated)
            all_labels.extend([class_idx] * num_samples_per_class)
        
        generated_images = tf.concat(all_images, axis=0)
        
        # Create visualization
        n_classes = len(class_subset)
        n_cols = min(8, n_classes)
        n_rows = (n_classes + n_cols - 1) // n_cols
        
        fig = plt.figure(figsize=(n_cols * 2, n_rows * 2 * num_samples_per_class))
        fig.suptitle('Generated Letter Samples by Class', fontsize=16, fontweight='bold')
        
        for idx, class_idx in enumerate(class_subset):
            for sample_idx in range(num_samples_per_class):
                img_idx = idx * num_samples_per_class + sample_idx
                
                row = (idx // n_cols) * num_samples_per_class + sample_idx
                col = idx % n_cols
                
                ax = plt.subplot(n_rows * num_samples_per_class, n_cols, row * n_cols + col + 1)
                
                # Display image
                img = generated_images[img_idx].numpy().squeeze()
                ax.imshow(img, cmap='gray', vmin=-1, vmax=1)
                ax.axis('off')
                
                # Add class label on first sample
                if sample_idx == 0:
                    letter = self.data_processor.class_to_letter.get(class_idx, f'Class_{class_idx}')
                    ax.set_title(f'{letter} (Class {class_idx})', fontsize=10, fontweight='bold')
        
        plt.tight_layout()
        plt.show()
        
        return generated_images, np.array(all_labels)

# Try to import tensorflow_probability for advanced statistics
try:
    import tensorflow_probability as tfp
    print("TensorFlow Probability available for advanced metrics")
except ImportError:
    print("TensorFlow Probability not available - some metrics may be limited")
    # Create a minimal replacement
    class tfp:
        class stats:
            @staticmethod
            def covariance(x):
                mean_x = tf.reduce_mean(x, axis=0, keepdims=True)
                x_centered = x - mean_x
                return tf.matmul(x_centered, x_centered, transpose_a=True) / (tf.cast(tf.shape(x)[0], tf.float32) - 1)

# Initialize evaluation metrics
evaluation_metrics = GANEvaluationMetrics(data_processor)
print("Evaluation metrics ready!")
TensorFlow Probability not available - some metrics may be limited
InceptionV3 model loaded for FID calculation
Evaluation metrics ready!
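
As a sanity check on the Fréchet formula used in `calculate_fid` above, the same computation can be sketched standalone in NumPy/SciPy. This is a toy illustration only: random Gaussian features stand in for Inception activations, and `scipy.linalg.sqrtm` plays the role of `tf.linalg.sqrtm`.

```python
import numpy as np
from scipy import linalg

def fid_from_stats(mu1, sigma1, mu2, sigma2):
    # Frechet distance between two Gaussians:
    # ||mu1 - mu2||^2 + Tr(S1 + S2 - 2*sqrt(S1 @ S2))
    diff = mu1 - mu2
    covmean = linalg.sqrtm(sigma1 @ sigma2)
    if np.iscomplexobj(covmean):
        covmean = covmean.real  # drop tiny imaginary parts from numerical error
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))

def feat_stats(x):
    # Per-feature mean and covariance, as in the TF implementation above
    return x.mean(axis=0), np.cov(x, rowvar=False)

rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(500, 8))          # stand-in "real" features
fake_same = rng.normal(0.0, 1.0, size=(500, 8))     # same distribution
fake_shifted = rng.normal(2.0, 1.0, size=(500, 8))  # mean-shifted distribution

fid_close = fid_from_stats(*feat_stats(real), *feat_stats(fake_same))
fid_far = fid_from_stats(*feat_stats(real), *feat_stats(fake_shifted))
print(f"FID(real, same dist)    = {fid_close:.3f}")
print(f"FID(real, shifted dist) = {fid_far:.3f}")
```

Matched distributions should score near zero, while the shifted distribution picks up roughly the squared mean gap, which is why lower FID indicates better generation.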
In [18]:
# =============================================================================
# BASELINE DCGAN TRAINER AND VISUALIZATION
# =============================================================================
# Define training parameters
BATCH_SIZE = 64
BUFFER_SIZE = 1000

class BaselineGANTrainer:
    """
    Baseline GAN training pipeline following standard DCGAN training procedure
    """
    
    def __init__(self, generator, discriminator, latent_dim=100, num_classes=16):
        self.generator = generator
        self.discriminator = discriminator
        self.latent_dim = latent_dim
        self.num_classes = num_classes
        
        # Standard DCGAN optimizers (as per original paper)
        self.gen_optimizer = tf.keras.optimizers.Adam(learning_rate=0.0002, beta_1=0.5)
        self.disc_optimizer = tf.keras.optimizers.Adam(learning_rate=0.0002, beta_1=0.5)
        
        # Loss functions
        self.bce_loss = tf.keras.losses.BinaryCrossentropy(from_logits=False)
        self.categorical_loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)
        
        # Training history
        self.history = {
            'gen_loss': [],
            'disc_loss': [],
            'label_accuracy': [],
            'epoch': []
        }
    
    @tf.function
    def train_discriminator_step(self, real_images, real_labels, batch_size):
        """Single discriminator training step"""
        # Generate fake images
        noise = tf.random.normal([batch_size, self.latent_dim])
        fake_labels = tf.random.uniform([batch_size], 0, self.num_classes, dtype=tf.int32)
        fake_images = self.generator([noise, fake_labels], training=True)
        
        with tf.GradientTape() as tape:
            # Real images
            real_validity, real_label_pred = self.discriminator([real_images, real_labels], training=True)
            
            # Fake images
            fake_validity, fake_label_pred = self.discriminator([fake_images, fake_labels], training=True)
            
            # Adversarial losses
            real_loss = self.bce_loss(tf.ones_like(real_validity), real_validity)
            fake_loss = self.bce_loss(tf.zeros_like(fake_validity), fake_validity)
            
            # Label classification losses
            real_label_loss = self.categorical_loss(real_labels, real_label_pred)
            fake_label_loss = self.categorical_loss(fake_labels, fake_label_pred)
            
            # Total discriminator loss
            disc_loss = (real_loss + fake_loss) / 2 + (real_label_loss + fake_label_loss) / 2
        
        # Update discriminator
        gradients = tape.gradient(disc_loss, self.discriminator.trainable_variables)
        self.disc_optimizer.apply_gradients(zip(gradients, self.discriminator.trainable_variables))
        
        return disc_loss, real_label_pred, real_labels
    
    @tf.function
    def train_generator_step(self, batch_size):
        """Single generator training step"""
        noise = tf.random.normal([batch_size, self.latent_dim])
        fake_labels = tf.random.uniform([batch_size], 0, self.num_classes, dtype=tf.int32)
        
        with tf.GradientTape() as tape:
            fake_images = self.generator([noise, fake_labels], training=True)
            fake_validity, fake_label_pred = self.discriminator([fake_images, fake_labels], training=True)
            
            # Generator wants discriminator to classify fake images as real
            adversarial_loss = self.bce_loss(tf.ones_like(fake_validity), fake_validity)
            
            # Generator wants correct label classification
            label_loss = self.categorical_loss(fake_labels, fake_label_pred)
            
            # Total generator loss
            gen_loss = adversarial_loss + label_loss
        
        # Update generator
        gradients = tape.gradient(gen_loss, self.generator.trainable_variables)
        self.gen_optimizer.apply_gradients(zip(gradients, self.generator.trainable_variables))
        
        return gen_loss
    
    def train_epoch(self, dataset, epoch, steps_per_epoch):
        """Train for one epoch"""
        gen_loss_avg = tf.keras.metrics.Mean()
        disc_loss_avg = tf.keras.metrics.Mean()
        accuracy_avg = tf.keras.metrics.SparseCategoricalAccuracy()
        
        # Training progress bar
        progress_bar = tqdm(enumerate(dataset.take(steps_per_epoch)), 
                           total=steps_per_epoch, 
                           desc=f"Baseline Epoch {epoch + 1}")
        
        for step, (real_images, real_labels) in progress_bar:
            batch_size = tf.shape(real_images)[0]
            
            # Train discriminator
            disc_loss, real_pred, real_labels_batch = self.train_discriminator_step(
                real_images, real_labels, batch_size
            )
            
            # Train generator
            gen_loss = self.train_generator_step(batch_size)
            
            # Update metrics
            gen_loss_avg.update_state(gen_loss)
            disc_loss_avg.update_state(disc_loss)
            accuracy_avg.update_state(real_labels_batch, real_pred)
            
            # Update progress bar
            if step % 20 == 0:
                progress_bar.set_postfix({
                    'Gen Loss': f'{gen_loss_avg.result():.4f}',
                    'Disc Loss': f'{disc_loss_avg.result():.4f}',
                    'Accuracy': f'{accuracy_avg.result():.4f}'
                })
        
        # Store epoch results
        final_gen_loss = float(gen_loss_avg.result())
        final_disc_loss = float(disc_loss_avg.result())
        final_accuracy = float(accuracy_avg.result())
        
        self.history['gen_loss'].append(final_gen_loss)
        self.history['disc_loss'].append(final_disc_loss)
        self.history['label_accuracy'].append(final_accuracy)
        self.history['epoch'].append(epoch)
        
        print(f"Epoch {epoch + 1} - Gen Loss: {final_gen_loss:.4f}, Disc Loss: {final_disc_loss:.4f}, Accuracy: {final_accuracy:.4f}")

def display_generated_samples_grid_baseline(generator, class_to_letter, epoch=None, samples_per_class=6):
    """
    Display generated samples for all classes in a grid format
    Same visualization as enhanced version for comparison
    """
    print(f"Generating {samples_per_class} samples per class for all {len(class_to_letter)} letter classes...")
    
    # Generate samples for each class
    all_images = []
    all_labels = []
    
    for class_num in sorted(class_to_letter.keys()):
        # Generate noise and labels for this class
        noise = tf.random.normal([samples_per_class, 100])  # latent_dim = 100
        labels = tf.fill([samples_per_class], class_num)
        
        # Generate images
        generated_imgs = generator([noise, labels], training=False)
        
        all_images.append(generated_imgs)
        all_labels.extend([class_num] * samples_per_class)
    
    # Concatenate all generated images
    all_images = tf.concat(all_images, axis=0)
    
    # Convert to numpy for plotting
    generated_images = (all_images.numpy() + 1) / 2.0  # Convert from [-1,1] to [0,1]
    
    # Create the grid plot
    n_classes = len(class_to_letter)
    n_samples = samples_per_class
    
    # Create figure
    fig, axes = plt.subplots(n_samples, n_classes, figsize=(n_classes * 1.2, n_samples * 1.2))
    
    # Set title
    title = f"Generated Letter Samples - All Classes"
    if epoch is not None:
        title += f" (Epoch {epoch})"
    title += f"\n{samples_per_class} samples per class"
    fig.suptitle(title, fontsize=14, fontweight='bold')
    
    # Plot images
    for class_idx, (class_num, letter) in enumerate(sorted(class_to_letter.items())):
        # Add column header
        axes[0, class_idx].set_title(f"{letter} (Class {class_num})", fontsize=10, fontweight='bold')
        
        # Plot samples for this class
        for sample_idx in range(samples_per_class):
            img_idx = class_idx * samples_per_class + sample_idx
            
            ax = axes[sample_idx, class_idx]
            ax.imshow(generated_images[img_idx, :, :, 0], cmap='gray')
            ax.axis('off')
    
    plt.tight_layout()
    plt.show()
    
    return all_images, all_labels

# Initialize baseline trainer
print("Initializing Baseline GAN Trainer...")
baseline_trainer = BaselineGANTrainer(
    generator=baseline_generator,
    discriminator=baseline_discriminator,
    latent_dim=100,
    num_classes=num_classes
)

# Create TensorFlow dataset for efficient training
train_dataset = tf.data.Dataset.from_tensor_slices((X_train, y_train))
train_dataset = train_dataset.shuffle(BUFFER_SIZE)
train_dataset = train_dataset.batch(BATCH_SIZE, drop_remainder=True)
train_dataset = train_dataset.prefetch(tf.data.AUTOTUNE)

# Calculate steps per epoch
steps_per_epoch = len(X_train) // BATCH_SIZE
Initializing Baseline GAN Trainer...
In [18]:
# =============================================================================
# BASELINE DCGAN TRAINING - COMPLETE 50 EPOCH TRAINING
# =============================================================================

# Test visualization first
print("Testing visualization with current baseline models...")
test_images, test_labels = display_generated_samples_grid_baseline(
    baseline_generator, class_to_letter, samples_per_class=6
)

# Training configuration
NUM_EPOCHS = 50

start_time = time.time()

for epoch in range(NUM_EPOCHS):
    epoch_start = time.time()
    
    # Train for one epoch
    baseline_trainer.train_epoch(train_dataset, epoch, steps_per_epoch)
    
    # Display only on the last epoch
    if (epoch + 1) == NUM_EPOCHS:
        print(f"\nGenerating Grid of Samples at Final Epoch {epoch + 1}:")
        display_generated_samples_grid_baseline(baseline_generator, class_to_letter, epoch + 1, samples_per_class=6)
    
    # Calculate and display epoch timing
    epoch_time = time.time() - epoch_start
    total_time = time.time() - start_time
    avg_time = total_time / (epoch + 1)
    eta = avg_time * (NUM_EPOCHS - epoch - 1)
    print(f"Epoch time: {epoch_time:.1f}s | Elapsed: {total_time/60:.1f} min | ETA: {eta/60:.1f} min")
    
total_training_time = time.time() - start_time
print(f"Total training time: {total_training_time/60:.1f} minutes")

# Generate final display
print(f"\nFinal Generated Samples - All Letter Classes:")
final_images, final_labels = display_generated_samples_grid_baseline(
    baseline_generator, class_to_letter, NUM_EPOCHS, samples_per_class=6
)

# Display comprehensive training summary
available_letters = sorted(class_to_letter.values())
missing_letters = [letter for letter in 'ABCDEFGHIJKLMNOPQRSTUVWXYZ' if letter not in available_letters]
print(f"Letter classes trained: {', '.join(available_letters)}")
print(f"Letters not in this subset: {', '.join(missing_letters)}")

# Plot training progress
if len(baseline_trainer.history['gen_loss']) > 1:
    plt.figure(figsize=(15, 5))
    
    # Generator and Discriminator Loss
    plt.subplot(1, 3, 1)
    epochs = baseline_trainer.history['epoch']
    plt.plot(epochs, baseline_trainer.history['gen_loss'], label='Generator Loss', color='blue', linewidth=2)
    plt.plot(epochs, baseline_trainer.history['disc_loss'], label='Discriminator Loss', color='red', linewidth=2)
    plt.title('Baseline DCGAN - Training Losses', fontsize=14, fontweight='bold')
    plt.xlabel('Epoch')
    plt.ylabel('Loss')
    plt.legend()
    plt.grid(True, alpha=0.3)
    
    # Discriminator Accuracy
    plt.subplot(1, 3, 2)
    plt.plot(epochs, baseline_trainer.history['label_accuracy'], label='Label Classification Accuracy', color='green', linewidth=2)
    plt.axhline(y=0.95, color='red', linestyle='--', alpha=0.7, label='Upper limit (95%)')
    plt.axhline(y=0.70, color='orange', linestyle='--', alpha=0.7, label='Lower limit (70%)')
    plt.title('Baseline DCGAN - Label Classification Accuracy', fontsize=14, fontweight='bold')
    plt.xlabel('Epoch')
    plt.ylabel('Accuracy')
    plt.legend()
    plt.grid(True, alpha=0.3)
    
    # Loss Difference
    plt.subplot(1, 3, 3)
    loss_diff = [g - d for g, d in zip(baseline_trainer.history['gen_loss'], baseline_trainer.history['disc_loss'])]
    plt.plot(epochs, loss_diff, label='Gen Loss - Disc Loss', color='purple', linewidth=2)
    plt.axhline(y=0, color='black', linestyle='-', alpha=0.5)
    plt.title('Baseline DCGAN - Loss Balance', fontsize=14, fontweight='bold')
    plt.xlabel('Epoch')
    plt.ylabel('Loss Difference')
    plt.legend()
    plt.grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.show()
Testing visualization with current baseline models...
Generating 6 samples per class for all 16 letter classes...
[Image: generated samples grid from the untrained baseline generator]
Baseline Epoch 1: 100%|██████████| 597/597 [00:58<00:00, 10.12it/s, Gen Loss=2.4564, Disc Loss=1.4497, Accuracy=0.7198]
Epoch 1 - Gen Loss: 2.4375, Disc Loss: 1.4306, Accuracy: 0.7254
Baseline Epoch 2: 100%|██████████| 597/597 [00:48<00:00, 12.25it/s, Gen Loss=1.4203, Disc Loss=0.6453, Accuracy=0.9699]
Epoch 2 - Gen Loss: 1.4160, Disc Loss: 0.6447, Accuracy: 0.9705
Baseline Epoch 3: 100%|██████████| 597/597 [00:51<00:00, 11.59it/s, Gen Loss=1.4047, Disc Loss=0.5791, Accuracy=0.9959]
Epoch 3 - Gen Loss: 1.4021, Disc Loss: 0.5786, Accuracy: 0.9959
Baseline Epoch 4: 100%|██████████| 597/597 [00:49<00:00, 12.06it/s, Gen Loss=1.3079, Disc Loss=0.5819, Accuracy=0.9990]
Epoch 4 - Gen Loss: 1.3076, Disc Loss: 0.5812, Accuracy: 0.9990
Baseline Epoch 5: 100%|██████████| 597/597 [00:46<00:00, 12.77it/s, Gen Loss=1.2661, Disc Loss=0.5754, Accuracy=0.9994]
Epoch 5 - Gen Loss: 1.2650, Disc Loss: 0.5756, Accuracy: 0.9993
Baseline Epoch 6: 100%|██████████| 597/597 [00:47<00:00, 12.67it/s, Gen Loss=1.3042, Disc Loss=0.5609, Accuracy=0.9998]
Epoch 6 - Gen Loss: 1.3026, Disc Loss: 0.5603, Accuracy: 0.9998
Baseline Epoch 7: 100%|██████████| 597/597 [00:51<00:00, 11.69it/s, Gen Loss=1.3049, Disc Loss=0.5543, Accuracy=0.9999]
Epoch 7 - Gen Loss: 1.3044, Disc Loss: 0.5551, Accuracy: 0.9999
Baseline Epoch 8: 100%|██████████| 597/597 [00:47<00:00, 12.62it/s, Gen Loss=1.2989, Disc Loss=0.5558, Accuracy=0.9997]
Epoch 8 - Gen Loss: 1.3006, Disc Loss: 0.5560, Accuracy: 0.9997
Baseline Epoch 9: 100%|██████████| 597/597 [00:50<00:00, 11.85it/s, Gen Loss=1.3031, Disc Loss=0.5515, Accuracy=0.9999]
Epoch 9 - Gen Loss: 1.3036, Disc Loss: 0.5513, Accuracy: 0.9999
Baseline Epoch 10: 100%|██████████| 597/597 [00:49<00:00, 12.15it/s, Gen Loss=1.2987, Disc Loss=0.5536, Accuracy=0.9999]
Epoch 10 - Gen Loss: 1.2971, Disc Loss: 0.5538, Accuracy: 0.9999
Baseline Epoch 11: 100%|██████████| 597/597 [00:47<00:00, 12.66it/s, Gen Loss=1.3250, Disc Loss=0.5457, Accuracy=0.9999]
Epoch 11 - Gen Loss: 1.3236, Disc Loss: 0.5458, Accuracy: 0.9999
Baseline Epoch 12: 100%|██████████| 597/597 [00:51<00:00, 11.65it/s, Gen Loss=1.3343, Disc Loss=0.5464, Accuracy=0.9999]
Epoch 12 - Gen Loss: 1.3348, Disc Loss: 0.5478, Accuracy: 0.9999
Baseline Epoch 13: 100%|██████████| 597/597 [00:49<00:00, 12.11it/s, Gen Loss=1.3706, Disc Loss=0.5409, Accuracy=0.9998]
Epoch 13 - Gen Loss: 1.3682, Disc Loss: 0.5415, Accuracy: 0.9998
Baseline Epoch 14: 100%|██████████| 597/597 [00:50<00:00, 11.71it/s, Gen Loss=1.3536, Disc Loss=0.5417, Accuracy=0.9999]
Epoch 14 - Gen Loss: 1.3533, Disc Loss: 0.5412, Accuracy: 0.9999
Baseline Epoch 15: 100%|██████████| 597/597 [00:49<00:00, 12.09it/s, Gen Loss=1.3958, Disc Loss=0.5325, Accuracy=1.0000]
Epoch 15 - Gen Loss: 1.3959, Disc Loss: 0.5322, Accuracy: 1.0000
Baseline Epoch 16: 100%|██████████| 597/597 [00:53<00:00, 11.07it/s, Gen Loss=1.3987, Disc Loss=0.5371, Accuracy=0.9999]
Epoch 16 - Gen Loss: 1.3978, Disc Loss: 0.5374, Accuracy: 0.9999
Baseline Epoch 17: 100%|██████████| 597/597 [00:49<00:00, 12.18it/s, Gen Loss=1.3911, Disc Loss=0.5359, Accuracy=0.9999]
Epoch 17 - Gen Loss: 1.3910, Disc Loss: 0.5357, Accuracy: 0.9999
Baseline Epoch 18: 100%|██████████| 597/597 [00:53<00:00, 11.13it/s, Gen Loss=1.4430, Disc Loss=0.5257, Accuracy=1.0000]
Epoch 18 - Gen Loss: 1.4402, Disc Loss: 0.5261, Accuracy: 1.0000
Baseline Epoch 19: 100%|██████████| 597/597 [00:53<00:00, 11.20it/s, Gen Loss=1.4619, Disc Loss=0.5235, Accuracy=1.0000]
Epoch 19 - Gen Loss: 1.4656, Disc Loss: 0.5236, Accuracy: 1.0000
Baseline Epoch 20: 100%|██████████| 597/597 [00:54<00:00, 11.01it/s, Gen Loss=1.5014, Disc Loss=0.5165, Accuracy=1.0000]
Epoch 20 - Gen Loss: 1.5007, Disc Loss: 0.5166, Accuracy: 1.0000
Baseline Epoch 21: 100%|██████████| 597/597 [00:53<00:00, 11.08it/s, Gen Loss=1.5407, Disc Loss=0.5105, Accuracy=1.0000]
Epoch 21 - Gen Loss: 1.5394, Disc Loss: 0.5101, Accuracy: 1.0000
Baseline Epoch 22: 100%|██████████| 597/597 [00:52<00:00, 11.42it/s, Gen Loss=1.5242, Disc Loss=0.5123, Accuracy=0.9999]
Epoch 22 - Gen Loss: 1.5211, Disc Loss: 0.5122, Accuracy: 0.9999
Baseline Epoch 23: 100%|██████████| 597/597 [00:54<00:00, 10.93it/s, Gen Loss=1.5693, Disc Loss=0.4954, Accuracy=0.9999]
Epoch 23 - Gen Loss: 1.5712, Disc Loss: 0.4959, Accuracy: 0.9999
Baseline Epoch 24: 100%|██████████| 597/597 [00:54<00:00, 10.93it/s, Gen Loss=1.5886, Disc Loss=0.4974, Accuracy=0.9999]
Epoch 24 - Gen Loss: 1.5867, Disc Loss: 0.4981, Accuracy: 0.9999
Baseline Epoch 25: 100%|██████████| 597/597 [00:53<00:00, 11.14it/s, Gen Loss=1.5976, Disc Loss=0.4991, Accuracy=0.9999]
Epoch 25 - Gen Loss: 1.5988, Disc Loss: 0.4995, Accuracy: 0.9999
Baseline Epoch 26: 100%|██████████| 597/597 [00:53<00:00, 11.19it/s, Gen Loss=1.6516, Disc Loss=0.4891, Accuracy=1.0000]
Epoch 26 - Gen Loss: 1.6531, Disc Loss: 0.4894, Accuracy: 1.0000
Baseline Epoch 27: 100%|██████████| 597/597 [00:53<00:00, 11.14it/s, Gen Loss=1.6739, Disc Loss=0.4839, Accuracy=1.0000]
Epoch 27 - Gen Loss: 1.6735, Disc Loss: 0.4862, Accuracy: 1.0000
Baseline Epoch 28: 100%|██████████| 597/597 [00:53<00:00, 11.20it/s, Gen Loss=1.6874, Disc Loss=0.4825, Accuracy=0.9999]
Epoch 28 - Gen Loss: 1.6895, Disc Loss: 0.4831, Accuracy: 0.9999
Baseline Epoch 29: 100%|██████████| 597/597 [00:51<00:00, 11.55it/s, Gen Loss=1.6793, Disc Loss=0.4757, Accuracy=1.0000]
Epoch 29 - Gen Loss: 1.6847, Disc Loss: 0.4761, Accuracy: 1.0000
Baseline Epoch 30: 100%|██████████| 597/597 [00:54<00:00, 11.03it/s, Gen Loss=1.6981, Disc Loss=0.4760, Accuracy=1.0000]
Epoch 30 - Gen Loss: 1.6957, Disc Loss: 0.4761, Accuracy: 1.0000
Baseline Epoch 31: 100%|██████████| 597/597 [00:53<00:00, 11.16it/s, Gen Loss=1.7301, Disc Loss=0.4752, Accuracy=1.0000]
Epoch 31 - Gen Loss: 1.7304, Disc Loss: 0.4743, Accuracy: 1.0000
Baseline Epoch 32: 100%|██████████| 597/597 [00:54<00:00, 11.03it/s, Gen Loss=1.7511, Disc Loss=0.4747, Accuracy=1.0000]
Epoch 32 - Gen Loss: 1.7477, Disc Loss: 0.4753, Accuracy: 1.0000
Baseline Epoch 33: 100%|██████████| 597/597 [00:54<00:00, 11.00it/s, Gen Loss=1.7418, Disc Loss=0.4664, Accuracy=1.0000]
Epoch 33 - Gen Loss: 1.7431, Disc Loss: 0.4662, Accuracy: 1.0000
Baseline Epoch 34: 100%|██████████| 597/597 [00:54<00:00, 10.90it/s, Gen Loss=1.7243, Disc Loss=0.4692, Accuracy=1.0000]
Epoch 34 - Gen Loss: 1.7198, Disc Loss: 0.4699, Accuracy: 1.0000
Baseline Epoch 35: 100%|██████████| 597/597 [00:54<00:00, 11.01it/s, Gen Loss=1.7807, Disc Loss=0.4692, Accuracy=0.9999]
Epoch 35 - Gen Loss: 1.7764, Disc Loss: 0.4697, Accuracy: 0.9999
Baseline Epoch 36: 100%|██████████| 597/597 [00:53<00:00, 11.07it/s, Gen Loss=1.7521, Disc Loss=0.4657, Accuracy=1.0000]
Epoch 36 - Gen Loss: 1.7530, Disc Loss: 0.4664, Accuracy: 1.0000
Baseline Epoch 37: 100%|██████████| 597/597 [00:45<00:00, 13.05it/s, Gen Loss=1.8083, Disc Loss=0.4560, Accuracy=0.9999]
Epoch 37 - Gen Loss: 1.8065, Disc Loss: 0.4565, Accuracy: 0.9999
Baseline Epoch 38: 100%|██████████| 597/597 [00:25<00:00, 23.61it/s, Gen Loss=1.8075, Disc Loss=0.4600, Accuracy=0.9999]
Epoch 38 - Gen Loss: 1.8062, Disc Loss: 0.4602, Accuracy: 0.9999
Baseline Epoch 39: 100%|██████████| 597/597 [00:25<00:00, 23.23it/s, Gen Loss=1.8461, Disc Loss=0.4548, Accuracy=0.9999]
Epoch 39 - Gen Loss: 1.8456, Disc Loss: 0.4543, Accuracy: 0.9999
Baseline Epoch 40: 100%|██████████| 597/597 [00:26<00:00, 22.35it/s, Gen Loss=1.8436, Disc Loss=0.4484, Accuracy=1.0000]
Epoch 40 - Gen Loss: 1.8434, Disc Loss: 0.4480, Accuracy: 1.0000
Baseline Epoch 41: 100%|██████████| 597/597 [00:26<00:00, 22.29it/s, Gen Loss=1.8502, Disc Loss=0.4508, Accuracy=1.0000]
Epoch 41 - Gen Loss: 1.8521, Disc Loss: 0.4517, Accuracy: 1.0000
Baseline Epoch 42: 100%|██████████| 597/597 [00:26<00:00, 22.74it/s, Gen Loss=1.9027, Disc Loss=0.4426, Accuracy=1.0000]
Epoch 42 - Gen Loss: 1.9050, Disc Loss: 0.4421, Accuracy: 1.0000
Baseline Epoch 43: 100%|██████████| 597/597 [00:26<00:00, 22.63it/s, Gen Loss=1.8602, Disc Loss=0.4516, Accuracy=1.0000]
Epoch 43 - Gen Loss: 1.8583, Disc Loss: 0.4515, Accuracy: 1.0000
Baseline Epoch 44: 100%|██████████| 597/597 [00:26<00:00, 22.38it/s, Gen Loss=1.8741, Disc Loss=0.4508, Accuracy=1.0000]
Epoch 44 - Gen Loss: 1.8755, Disc Loss: 0.4516, Accuracy: 1.0000
Baseline Epoch 45: 100%|██████████| 597/597 [00:26<00:00, 22.14it/s, Gen Loss=1.8602, Disc Loss=0.4456, Accuracy=1.0000]
Epoch 45 - Gen Loss: 1.8620, Disc Loss: 0.4455, Accuracy: 1.0000
Baseline Epoch 46: 100%|██████████| 597/597 [00:26<00:00, 22.40it/s, Gen Loss=1.8686, Disc Loss=0.4492, Accuracy=0.9999]
Epoch 46 - Gen Loss: 1.8642, Disc Loss: 0.4491, Accuracy: 0.9999
Baseline Epoch 47: 100%|██████████| 597/597 [00:26<00:00, 22.30it/s, Gen Loss=1.8764, Disc Loss=0.4468, Accuracy=1.0000]
Epoch 47 - Gen Loss: 1.8784, Disc Loss: 0.4469, Accuracy: 1.0000
Baseline Epoch 48: 100%|██████████| 597/597 [00:27<00:00, 21.96it/s, Gen Loss=1.8987, Disc Loss=0.4433, Accuracy=0.9999]
Epoch 48 - Gen Loss: 1.8920, Disc Loss: 0.4430, Accuracy: 0.9999
Baseline Epoch 49: 100%|██████████| 597/597 [00:27<00:00, 21.72it/s, Gen Loss=1.9094, Disc Loss=0.4438, Accuracy=1.0000]
Epoch 49 - Gen Loss: 1.9082, Disc Loss: 0.4438, Accuracy: 1.0000
Baseline Epoch 50: 100%|██████████| 597/597 [00:27<00:00, 21.84it/s, Gen Loss=1.9038, Disc Loss=0.4422, Accuracy=1.0000]
Epoch 50 - Gen Loss: 1.9036, Disc Loss: 0.4423, Accuracy: 1.0000

Generating Grid of Samples at Final Epoch 50:
Generating 6 samples per class for all 16 letter classes...
[Image: generated samples grid at final epoch 50]
Final Generated Samples - All Letter Classes:
Generating 6 samples per class for all 16 letter classes...
[Image: final generated samples grid, all letter classes]
[Image: baseline DCGAN training curves (losses, label accuracy, loss balance)]

5.3.1 Baseline DCGAN Results Analysis¶

Comprehensive Performance Assessment¶

The baseline DCGAN implementation reveals several critical training characteristics that inform our understanding of standard GAN dynamics and motivate the need for enhanced techniques.

Visual Quality Assessment¶

Generated Sample Analysis:

  • Recognizable Features: Most generated letters display fundamental alphabetic structures
  • Class Fidelity: Letters generally correspond to their intended classes
  • Quality Variation: Significant variance in generation quality across different letter classes
  • Structural Issues: Some letters (G, Q, N) exhibit distortion, incomplete features, or inconsistent stroke patterns

Problematic Cases Identified:

  • Complex Letters: Characters with intricate features (G, Q) show degraded quality
  • Similar Characters: Potential confusion between visually similar letters
  • Stroke Consistency: Inconsistent line thickness and connectivity across samples
  • Spatial Alignment: Some letters appear off-center or poorly positioned

Training Dynamics Analysis¶

Loss Trajectory Interpretation: The training progression reveals classic GAN training imbalance issues:

  1. Discriminator Dominance Pattern:

    • Discriminator loss steadily decreasing throughout training
    • Generator loss increasing over time
    • Widening gap indicates discriminator becoming overly powerful
  2. Performance Metrics:

    • Discriminator Accuracy: the label-classification head reaches 100% on real images, a sign of over-optimization
    • Generator Feedback: Poor gradient quality due to saturated discriminator
    • Training Imbalance: Clear evidence of unstable adversarial dynamics

Technical Root Cause Analysis¶

Discriminator Over-Training Issues:

  • Perfect Classification: 100% accuracy suggests discriminator memorization rather than generalization
  • Gradient Saturation: Saturated discriminator provides minimal learning signal to generator
  • Information Imbalance: Discriminator learns faster than generator can adapt

Generator Learning Constraints:

  • Poor Gradient Flow: Saturated discriminator outputs provide weak gradients
  • Limited Exploration: Generator cannot effectively explore latent space
  • Mode Collapse Risk: Tendency toward safe but limited generation patterns

Implications for Enhanced Implementation¶

Identified Enhancement Requirements:

  1. Training Balance Mechanisms:

    • Learning Rate Adjustment: Slower discriminator training to prevent dominance
    • Training Frequency Control: Modified update ratios between networks
    • Label Smoothing: Softer targets to prevent discriminator over-confidence
  2. Architectural Improvements:

    • Regularization Techniques: Spectral normalization for training stability
    • Advanced Blocks: Residual connections for better gradient flow
    • Attention Mechanisms: Self-attention for improved feature learning
  3. Training Stability Enhancements:

    • Noise Injection: Adding noise to discriminator inputs
    • Progressive Training: Gradual complexity increase
    • Advanced Loss Functions: Alternative objectives for better balance

Baseline Performance Benchmarks¶

This baseline establishes quantitative benchmarks for our enhanced implementations:

  • Final Generator Loss: Baseline comparison metric
  • Final Discriminator Loss: Training balance indicator
  • Visual Quality Score: Subjective quality assessment baseline
  • Training Stability: Convergence pattern reference point

The analysis of these baseline results directly informs the design decisions for our enhanced DCGAN implementation, specifically targeting the identified training imbalance and quality issues through sophisticated architectural and training improvements.


5.4 Advanced GAN Techniques and Architectural Enhancements¶

5.4.1 Spectral Normalization for Training Stability¶

Theoretical Foundation and Mathematical Framework¶

Spectral Normalization represents a critical advancement in GAN training stability, introduced by Miyato et al. in 2018. This technique addresses fundamental issues in neural network optimization by controlling the Lipschitz constant of neural network layers, leading to more stable and reliable training dynamics.

Mathematical Principles¶

Spectral Norm Definition: The spectral norm of a weight matrix W is defined as its largest singular value:

||W||₂ = σₘₐₓ(W) = √(λₘₐₓ(WᵀW))

where σₘₐₓ(W) is the largest singular value of W and λₘₐₓ(WᵀW) is the largest eigenvalue of WᵀW.

Lipschitz Constraint Enforcement: For a linear layer f(x) = W·x, the smallest constant L satisfying

||f(x₁) - f(x₂)||₂ ≤ L||x₁ - x₂||₂

is exactly ||W||₂. Spectral normalization enforces L ≤ 1 by rescaling: W_normalized = W / ||W||₂

Implementation Algorithm¶

Power Iteration Method: The spectral norm is efficiently computed using power iteration:

  1. Initialize: Random vector u with ||u||₂ = 1
  2. Iterate: For k iterations:
    • v = W^T u / ||W^T u||₂
    • u = W v / ||W v||₂
  3. Estimate: σₘₐₓ ≈ u^T W v

Computational Efficiency:

  • Convergence: Power iteration converges quickly; because the u vector persists across training steps, a single iteration per step is usually sufficient
  • Memory: Minimal overhead storing only u vector
  • Speed: Negligible computational cost during training
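As a quick numerical check, the NumPy sketch below (our own illustration, not the spectral-normalization layer implemented later in this notebook) runs the power iteration described above, compares the estimate against `np.linalg.svd`, and verifies that dividing W by the estimate drives its spectral norm to approximately 1:

```python
import numpy as np

def power_iteration_sigma(W, n_iters=50, seed=0):
    """Estimate sigma_max(W) using the power iteration above."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=W.shape[0])
    u /= np.linalg.norm(u)
    for _ in range(n_iters):
        v = W.T @ u                 # v = W^T u / ||W^T u||
        v /= np.linalg.norm(v)
        u = W @ v                   # u = W v / ||W v||
        u /= np.linalg.norm(u)
    return u @ W @ v                # sigma_max ≈ u^T W v

W = np.random.default_rng(42).normal(size=(64, 32))
sigma_est = power_iteration_sigma(W)
sigma_true = np.linalg.svd(W, compute_uv=False)[0]

# Normalizing by the estimate bounds the layer's Lipschitz constant:
W_sn = W / sigma_est
print(sigma_est, sigma_true)                      # nearly identical
print(np.linalg.svd(W_sn, compute_uv=False)[0])   # ≈ 1.0
```

In the training-time implementation, u is stored as a non-trainable weight so that each step only needs one refinement iteration, which is why the overhead is negligible.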

Applications in GAN Training¶

Training Stability Benefits¶

Gradient Flow Stabilization:

  • Prevents Exploding Gradients: Bounded spectral norms ensure gradient magnitudes remain controlled
  • Consistent Learning: Stable gradient flow enables reliable parameter updates
  • Convergence Reliability: Reduced risk of training divergence or oscillation

Discriminator Regularization:

  • Prevents Over-Optimization: Controlled Lipschitz constant prevents discriminator from becoming too powerful
  • Balanced Competition: Maintains healthy adversarial dynamic between generator and discriminator
  • Gradient Quality: Ensures discriminator provides meaningful learning signals to generator

Practical Implementation Details¶

Layer Application Strategy:

  • Convolution Layers: Applied to all Conv2D layers in discriminator
  • Dense Layers: Applied to fully connected layers
  • Selective Application: Typically applied only to discriminator, not generator

Integration with Training:

  • Real-time Normalization: Applied during both forward and backward passes
  • Parameter Updates: Normalization occurs before gradient computation
  • Minimal Overhead: Efficient implementation with negligible performance impact

Advantages Over Alternative Approaches¶

Comparison with Other Regularization Methods¶

vs. Weight Decay:

  • More Principled: Directly controls network behavior rather than parameter magnitude
  • Training Specific: Designed specifically for neural network stability issues
  • Better Performance: Empirically superior results in GAN training scenarios

vs. Gradient Penalty:

  • Computational Efficiency: Lower computational overhead than gradient penalty methods
  • Implementation Simplicity: Easier to implement and integrate
  • Theoretical Guarantees: Strong mathematical foundation for stability claims

vs. Batch Normalization:

  • Complementary: Can be used alongside batch normalization
  • Different Objectives: BN normalizes activations, SN controls layer behavior
  • Specialized Purpose: Specifically designed for adversarial training scenarios

Expected Impact on Letter Generation¶

Quality Improvements¶

  • Feature Consistency: More stable feature learning leads to consistent letter structures
  • Reduced Artifacts: Elimination of training instabilities that cause visual artifacts
  • Better Convergence: Stable training enables longer training periods with continued improvement

Training Dynamics Enhancement¶

  • Balanced Learning: Preventing discriminator dominance allows the generator to learn effectively
  • Smooth Optimization: Reduced training oscillations lead to smoother loss curves
  • Reliable Progress: Consistent training progress without sudden degradation episodes

5.4.2 Residual Blocks for Enhanced Feature Learning¶

Theoretical Foundation and Architectural Innovation¶

Residual blocks, introduced by He et al. in the groundbreaking ResNet paper (2015), represent a fundamental breakthrough in deep neural network architecture. By introducing skip connections that allow information to bypass layers, residual blocks solve critical problems in deep network training and enable the construction of much deeper, more powerful networks.

Mathematical Framework¶

Residual Learning Formulation: Instead of learning a direct mapping H(x), residual blocks learn the residual function:

F(x) = H(x) - x

The final output becomes: H(x) = F(x) + x

Skip Connection Mechanism: The residual block computes:

y = F(x, {Wᵢ}) + x

Where:

  • F(x, {Wᵢ}): Residual function learned by stacked layers
  • x: Identity mapping (skip connection)
  • y: Final output after element-wise addition
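A one-line numeric check makes the gradient benefit concrete: since H(x) = F(x) + x, we have dH/dx = dF/dx + 1, so even a residual branch that contributes nothing still passes a unit gradient through the identity path. The toy function below is our own illustration, not the ResidualBlock class implemented later in this notebook:

```python
def F(x, w=0.0):
    """Stand-in residual branch; w = 0 models a branch that has
    learned nothing yet (illustrative only)."""
    return w * x

def H(x):
    return F(x) + x   # skip connection: H(x) = F(x) + x

# Central-difference derivative of H at x = 3: dH/dx = F'(x) + 1.
eps = 1e-6
dH = (H(3.0 + eps) - H(3.0 - eps)) / (2 * eps)
print(dH)  # ≈ 1.0: the identity term guarantees a nonzero gradient
```

With w = 0 the plain (non-residual) mapping F would pass zero gradient; the skip connection is what keeps dH/dx at 1, which is exactly the vanishing-gradient protection described above.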

Deep Learning Benefits¶

Gradient Flow Enhancement:

  • Direct Gradient Paths: Skip connections provide direct paths for gradient propagation
  • Vanishing Gradient Prevention: Gradients can flow directly through identity mappings
  • Training Stability: Stable gradients enable training of very deep networks

Learning Efficiency:

  • Identity Learning: Easy to learn identity mappings when optimal
  • Incremental Learning: Networks learn incremental improvements rather than complete transformations
  • Feature Preservation: Important features preserved through skip connections

Application to GAN Generator Architecture¶

Generator-Specific Advantages¶

Feature Quality Enhancement:

  • Fine Detail Preservation: Skip connections preserve fine-grained features during upsampling
  • Structural Consistency: Maintain overall letter structure while adding details
  • Multi-Scale Features: Combine features from different resolution levels

Training Dynamics Improvement:

  • Stable Convergence: Improved gradient flow leads to more stable training
  • Faster Learning: Reduced training time due to efficient gradient propagation
  • Better Optimization: Smoother loss landscapes facilitate optimization

Implementation in Conv2DTranspose Context¶

Architectural Adaptation: Our implementation adapts residual blocks for generative tasks using transposed convolutions:

Input → Conv2DTranspose → BatchNorm → ReLU → Conv2DTranspose → BatchNorm → (+) → ReLU
  ↓                                                                      ↑
  └────────────────── Skip Connection ──────────────────────────────────┘

Skip Connection Handling:

  • Dimension Matching: When stride ≠ 1, additional Conv2DTranspose adjusts dimensions
  • Channel Alignment: Skip convolutions ensure channel count compatibility
  • Batch Normalization: Applied before skip connection addition

Technical Implementation Details¶

Block Configuration Parameters¶

Layer Specifications:

  • Kernel Size: 3×3 for internal convolutions (efficient receptive field)
  • Stride Control: First convolution handles upsampling, second maintains resolution
  • Channel Management: Progressive channel reduction following DCGAN principles
  • Activation Strategy: ReLU for intermediate activations, final ReLU after addition

Normalization Strategy:

  • Batch Normalization: Applied after each convolution except final output
  • Training Mode: Proper handling of training vs. inference modes
  • Momentum: Standard BatchNorm momentum for stable statistics

Integration with DCGAN Architecture¶

Generator Flow Enhancement:

  1. Initial Projection: Dense layer projects latent space to initial feature map
  2. Residual Upsampling: Sequential residual blocks perform progressive upsampling
  3. Feature Refinement: Each block refines features while preserving important information
  4. Final Convolution: Standard convolution produces final output

Architectural Benefits:

  • Deeper Networks: Enables stable training of deeper generator architectures
  • Better Features: Improved feature learning through enhanced gradient flow
  • Quality Consistency: More consistent generation quality across different samples

Expected Impact on Letter Generation¶

Quality Improvements¶

Structural Enhancement:

  • Consistent Shapes: Better preservation of fundamental letter structures
  • Fine Details: Improved rendering of strokes, curves, and connection points
  • Style Consistency: More consistent handwriting style across generated samples

Training Benefits:

  • Stable Progress: Smoother training progression without sudden quality drops
  • Faster Convergence: Reduced training time to achieve target quality levels
  • Robustness: Less sensitivity to hyperparameter choices and initialization

Comparison with Standard Architecture¶

vs. Standard Conv2DTranspose Blocks:

  • Gradient Quality: Superior gradient flow enables better parameter updates
  • Feature Learning: More sophisticated feature representations
  • Training Stability: Reduced risk of training instabilities and mode collapse
  • Scalability: Better performance scaling with increased network depth

This residual block implementation, combined with spectral normalization in the discriminator, forms the foundation of our enhanced DCGAN architecture designed to overcome the limitations observed in our baseline implementation.


5.5 Enhanced DCGAN Implementation with Advanced Techniques¶

Comprehensive Architecture Integration and Training Strategy¶

This section presents our enhanced DCGAN implementation that integrates multiple state-of-the-art techniques to address the fundamental limitations identified in our baseline analysis. Our enhanced architecture combines residual blocks, spectral normalization, self-attention mechanisms, and balanced training strategies to achieve superior letter generation quality and training stability.

Enhanced Generator Architecture¶

Advanced Component Integration¶

Residual Block Enhancement:

  • Progressive Upsampling: Structured 7×7 → 14×14 → 28×28 resolution progression
  • Feature Refinement: Each residual block preserves and enhances features from previous levels
  • Skip Connection Optimization: Carefully designed skip paths for optimal gradient flow
  • Channel Evolution: Strategic channel reduction (256 → 128 → 64 → 1) following DCGAN principles

Self-Attention Mechanism:

  • Strategic Placement: Applied at intermediate resolution (14×14) for optimal computation/benefit balance
  • Feature Relationship Learning: Captures long-range spatial dependencies in letter structures
  • Query-Key-Value Architecture: Standard attention formulation adapted for convolutional features
  • Learnable Attention Weight: Gradual attention integration through trainable gamma parameter

Architectural Flow:

Noise + Label → Dense Projection → Reshape (7×7×256) → 
Residual Block (→14×14×128) → Self-Attention → 
Residual Block (→28×28×64) → Final Conv (→28×28×1)

Technical Advantages¶

Enhanced Feature Learning:

  • Multi-Scale Integration: Residual connections combine features from multiple scales
  • Spatial Coherence: Self-attention ensures spatial consistency across letter regions
  • Gradient Optimization: Superior gradient flow enables deeper, more effective networks

Enhanced Discriminator Architecture¶

Spectral Normalization Integration¶

Comprehensive Application:

  • All Convolutional Layers: Spectral normalization applied to every Conv2D layer
  • Dense Layer Regularization: Extends to fully connected layers for complete stability
  • Training Stability: Prevents discriminator over-optimization and maintains adversarial balance

Architecture Specification:

Input (28×28×1) → SN-Conv2D (→14×14×64) → 
SN-Conv2D (→7×7×128) → SN-Conv2D (→4×4×256) → 
Flatten → Dense (→512) → Dual Outputs

Dual Output Strategy:

  • Authenticity Classification: Binary real/fake determination
  • Class Prediction: Multi-class letter classification for conditional generation
  • Joint Optimization: Simultaneous training on both objectives

Balanced Training Strategy¶

Training Imbalance Prevention¶

Our enhanced implementation addresses the critical training imbalance issues identified in the baseline through multiple complementary techniques:

Learning Rate Differentiation:

  • Generator Rate: 0.0002 (standard DCGAN rate)
  • Discriminator Rate: 0.0001 (50% reduction to prevent dominance)
  • Adaptive Balance: Maintains healthy competition throughout training

Label Smoothing Implementation:

  • Real Label Smoothing: 0.9 instead of 1.0 (prevents overconfidence)
  • Fake Label Smoothing: 0.1 instead of 0.0 (adds uncertainty)
  • Discriminator Regularization: Prevents premature convergence to perfect classification

Noise Injection Strategy:

  • Input Noise: Gaussian noise (σ = 0.05) added to discriminator inputs
  • Regularization Effect: Makes discriminator task more challenging
  • Generalization: Improves discriminator robustness and prevents overfitting

Training Frequency Control:

  • Generator Training: 2x per discriminator update
  • Balanced Learning: Ensures generator receives adequate training opportunities
  • Dynamic Adaptation: Maintains competitive balance throughout training

Advanced Loss Formulation¶

Enhanced Generator Loss: Combines adversarial and classification objectives:

L_G = L_adversarial + L_classification
    = -E[log(D(G(z,c)))] + E[CE(c, D_class(G(z,c)))]

Enhanced Discriminator Loss: Incorporates label smoothing and dual objectives:

L_D = L_real + L_fake + L_class_real + L_class_fake

With label smoothing applied to authenticity targets.

Expected Performance Improvements¶

Qualitative Enhancements¶

Visual Quality:

  • Structural Consistency: Better preservation of letter shapes and proportions
  • Fine Detail Quality: Improved rendering of strokes, curves, and connection points
  • Class Fidelity: More accurate correspondence between generated images and target classes
  • Style Coherence: Consistent handwriting characteristics across different samples

Training Dynamics:

  • Stable Convergence: Smooth loss progression without oscillations
  • Balanced Competition: Healthy adversarial dynamics throughout training
  • Consistent Progress: Steady quality improvement over training epochs
  • Robustness: Reduced sensitivity to hyperparameter choices and random initialization

Quantitative Benchmarks¶

Comparison Metrics:

  • Loss Stability: Reduced variance in training loss trajectories
  • Convergence Speed: Faster achievement of target quality levels
  • Mode Coverage: Better representation of all letter classes
  • Training Balance: Optimal discriminator accuracy range (70-95%)

This enhanced implementation represents a comprehensive solution to the fundamental challenges in GAN training, specifically tailored for high-quality handwritten letter generation from our EMNIST dataset analysis.

In [19]:
# =============================================================================
# ENHANCED DCGAN 
# =============================================================================

# -----------------------------------------------------------------------------
# ENHANCED GENERATOR COMPONENTS
# -----------------------------------------------------------------------------

class ResidualBlock(tf.keras.layers.Layer):
    """Residual block for enhanced generator - improves gradient flow"""
    def __init__(self, filters, kernel_size=3, strides=1, **kwargs):
        super(ResidualBlock, self).__init__(**kwargs)
        self.filters = filters
        self.kernel_size = kernel_size
        self.strides = strides
        
        # Main path
        self.conv1 = tf.keras.layers.Conv2DTranspose(
            filters, kernel_size, strides=strides, padding='same', use_bias=False
        )
        self.bn1 = tf.keras.layers.BatchNormalization()
        self.relu1 = tf.keras.layers.ReLU()
        
        self.conv2 = tf.keras.layers.Conv2DTranspose(
            filters, kernel_size, strides=1, padding='same', use_bias=False
        )
        self.bn2 = tf.keras.layers.BatchNormalization()
        
        # Skip connection (if needed)
        self.skip_conv = None
        if strides != 1:
            self.skip_conv = tf.keras.layers.Conv2DTranspose(
                filters, 1, strides=strides, padding='same', use_bias=False
            )
            self.skip_bn = tf.keras.layers.BatchNormalization()
        
        self.final_relu = tf.keras.layers.ReLU()
    
    def call(self, inputs, training=None):
        # Main path
        x = self.conv1(inputs, training=training)
        x = self.bn1(x, training=training)
        x = self.relu1(x)
        
        x = self.conv2(x, training=training)
        x = self.bn2(x, training=training)
        
        # Skip connection
        if self.skip_conv is not None:
            skip = self.skip_conv(inputs, training=training)
            skip = self.skip_bn(skip, training=training)
        else:
            skip = inputs
        
        # Add skip connection and apply final activation
        x = x + skip
        return self.final_relu(x)

class SelfAttention(tf.keras.layers.Layer):
    """Self-attention mechanism for better feature learning"""
    def __init__(self, **kwargs):
        super(SelfAttention, self).__init__(**kwargs)
    
    def build(self, input_shape):
        self.channels = input_shape[-1]
        
        # Query, Key, Value convolutions
        self.query_conv = tf.keras.layers.Conv2D(self.channels // 8, 1, use_bias=False)
        self.key_conv = tf.keras.layers.Conv2D(self.channels // 8, 1, use_bias=False)
        self.value_conv = tf.keras.layers.Conv2D(self.channels, 1, use_bias=False)
        
        # Output convolution
        self.out_conv = tf.keras.layers.Conv2D(self.channels, 1, use_bias=False)
        
        # Learnable parameter for attention weight
        self.gamma = self.add_weight(name='gamma', shape=(), initializer='zeros', trainable=True)
    
    def call(self, inputs):
        shape = tf.shape(inputs)
        batch_size, height, width, channels = shape[0], shape[1], shape[2], shape[3]
        
        # Generate query, key, value
        query = self.query_conv(inputs)
        key = self.key_conv(inputs)
        value = self.value_conv(inputs)
        
        # Reshape for matrix multiplication
        query = tf.reshape(query, [batch_size, height * width, channels // 8])
        key = tf.reshape(key, [batch_size, height * width, channels // 8])
        value = tf.reshape(value, [batch_size, height * width, channels])
        
        # Compute attention
        attention = tf.nn.softmax(tf.matmul(query, key, transpose_b=True))
        out = tf.matmul(attention, value)
        
        # Reshape back
        out = tf.reshape(out, [batch_size, height, width, channels])
        out = self.out_conv(out)
        
        # Apply attention with learnable weight
        return inputs + self.gamma * out

def build_enhanced_generator(latent_dim=100, num_classes=16, img_height=28, img_width=28):
    """Build enhanced generator with residual blocks and self-attention"""
    
    # Noise input
    noise_input = tf.keras.layers.Input(shape=(latent_dim,), name='noise_input')
    
    # Label input for conditional generation
    label_input = tf.keras.layers.Input(shape=(), dtype='int32', name='label_input')
    
    # Embed labels
    label_embedding = tf.keras.layers.Embedding(num_classes, 50)(label_input)
    label_embedding = tf.keras.layers.Flatten()(label_embedding)
    
    # Concatenate noise and label
    x = tf.keras.layers.Concatenate()([noise_input, label_embedding])
    
    # Initial dense layer
    x = tf.keras.layers.Dense(7 * 7 * 256, use_bias=False)(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.ReLU()(x)
    
    # Reshape to start convolution
    x = tf.keras.layers.Reshape((7, 7, 256))(x)
    
    # First residual block (7x7x256 -> 14x14x128)
    x = ResidualBlock(128, strides=2, name='res_block_1')(x)
    
    # Add self-attention at intermediate resolution
    x = SelfAttention(name='self_attention')(x)
    
    # Second residual block (14x14x128 -> 28x28x64)
    x = ResidualBlock(64, strides=2, name='res_block_2')(x)
    
    # Final convolution to get single channel
    x = tf.keras.layers.Conv2DTranspose(
        1, 7, strides=1, padding='same', activation='tanh', name='output_conv'
    )(x)
    
    model = tf.keras.Model(
        inputs=[noise_input, label_input],
        outputs=x,
        name='enhanced_generator'
    )
    
    return model

# -----------------------------------------------------------------------------
# ENHANCED DISCRIMINATOR COMPONENTS
# -----------------------------------------------------------------------------

class SpectralNormalization(tf.keras.layers.Wrapper):
    """Spectral Normalization layer for improved training stability"""
    def __init__(self, layer, **kwargs):
        super(SpectralNormalization, self).__init__(layer, **kwargs)
        self.power_iterations = 1
    
    def build(self, input_shape):
        super(SpectralNormalization, self).build(input_shape)
        
        # Get weight matrix
        if hasattr(self.layer, 'kernel'):
            self.w = self.layer.kernel
        else:
            raise ValueError('Layer does not have kernel weights')
        
        # Initialize u vector for power iteration
        w_shape = self.w.shape.as_list()
        self.u = self.add_weight(
            shape=(1, w_shape[-1]),
            initializer='random_normal',
            name='u',
            trainable=False
        )
    
    def call(self, inputs, training=None):
        # Perform spectral normalization
        w_reshaped = tf.reshape(self.w, [-1, self.w.shape[-1]])
        
        # Power iteration
        u = self.u
        for _ in range(self.power_iterations):
            v = tf.nn.l2_normalize(tf.matmul(u, w_reshaped, transpose_b=True))
            u = tf.nn.l2_normalize(tf.matmul(v, w_reshaped))
        
        # Update u
        self.u.assign(u)
        
        # Compute spectral norm
        sigma = tf.matmul(tf.matmul(v, w_reshaped), u, transpose_b=True)
        
        # Normalize weights
        w_normalized = self.w / sigma
        
        # Temporarily replace weights
        original_w = self.layer.kernel
        self.layer.kernel = w_normalized
        
        # Call the layer
        output = self.layer(inputs, training=training)
        
        # Restore original weights
        self.layer.kernel = original_w
        
        return output

def build_enhanced_discriminator(img_height=28, img_width=28, num_classes=16):
    """Build enhanced discriminator with spectral normalization"""
    
    # Image input
    img_input = tf.keras.layers.Input(shape=(img_height, img_width, 1), name='img_input')
    
    # Label input
    label_input = tf.keras.layers.Input(shape=(), dtype='int32', name='label_input')
    
    # Process image
    x = img_input
    
    # First conv block (28x28x1 -> 14x14x64)
    x = SpectralNormalization(
        tf.keras.layers.Conv2D(64, 4, strides=2, padding='same')
    )(x)
    x = tf.keras.layers.LeakyReLU(alpha=0.2)(x)
    x = tf.keras.layers.Dropout(0.3)(x)
    
    # Second conv block (14x14x64 -> 7x7x128)
    x = SpectralNormalization(
        tf.keras.layers.Conv2D(128, 4, strides=2, padding='same')
    )(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.LeakyReLU(alpha=0.2)(x)
    x = tf.keras.layers.Dropout(0.3)(x)
    
    # Third conv block (7x7x128 -> 4x4x256)
    x = SpectralNormalization(
        tf.keras.layers.Conv2D(256, 4, strides=2, padding='same')
    )(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.LeakyReLU(alpha=0.2)(x)
    x = tf.keras.layers.Dropout(0.3)(x)
    
    # Flatten for dense layers
    x = tf.keras.layers.Flatten()(x)
    
    # Process label
    label_embedding = tf.keras.layers.Embedding(num_classes, 50)(label_input)
    label_embedding = tf.keras.layers.Flatten()(label_embedding)
    
    # Concatenate image features and label
    x = tf.keras.layers.Concatenate()([x, label_embedding])
    
    # Dense layers
    x = tf.keras.layers.Dense(512)(x)
    x = tf.keras.layers.LeakyReLU(alpha=0.2)(x)
    x = tf.keras.layers.Dropout(0.5)(x)
    
    # Output layers
    validity = tf.keras.layers.Dense(1, activation='sigmoid', name='validity')(x)
    label_pred = tf.keras.layers.Dense(num_classes, activation='softmax', name='label_pred')(x)
    
    model = tf.keras.Model(
        inputs=[img_input, label_input],
        outputs=[validity, label_pred],
        name='enhanced_discriminator'
    )
    
    return model

# -----------------------------------------------------------------------------
# 3. BUILD ENHANCED MODELS
# -----------------------------------------------------------------------------

# Build enhanced models with correct number of classes
enhanced_generator = build_enhanced_generator(
    latent_dim=100, 
    num_classes=num_classes,  # 16 classes
    img_height=28, 
    img_width=28
)

enhanced_discriminator = build_enhanced_discriminator(
    img_height=28, 
    img_width=28, 
    num_classes=num_classes  # 16 classes
)
In [20]:
# =============================================================================
# BALANCED GAN TRAINER & VISUALIZATION
# =============================================================================

class BalancedGANTrainer:
    """Balanced GAN training pipeline that prevents discriminator dominance"""
    
    def __init__(self, generator, discriminator, latent_dim=100, num_classes=16):
        self.generator = generator
        self.discriminator = discriminator
        self.latent_dim = latent_dim
        self.num_classes = num_classes
        
        # Balanced learning rates: slower discriminator, faster generator
        self.gen_optimizer = tf.keras.optimizers.Adam(learning_rate=0.0002, beta_1=0.5)
        self.disc_optimizer = tf.keras.optimizers.Adam(learning_rate=0.0001, beta_1=0.5)  # Slower
        
        # Loss functions
        self.bce_loss = tf.keras.losses.BinaryCrossentropy(from_logits=False)
        self.categorical_loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)
        
        # Label smoothing parameters
        self.real_label_smoothing = 0.9  # Instead of 1.0
        self.fake_label_smoothing = 0.1  # Instead of 0.0
        
        # Noise injection parameters
        self.noise_std = 0.05  # Add noise to discriminator inputs
        
        # Training frequency (train generator more often)
        self.gen_train_freq = 2  # Train generator 2 times per discriminator training
        
        # Metrics for tracking
        self.gen_loss_metric = tf.keras.metrics.Mean(name='gen_loss')
        self.disc_loss_metric = tf.keras.metrics.Mean(name='disc_loss')
        self.acc_metric = tf.keras.metrics.SparseCategoricalAccuracy(name='label_accuracy')
        
        # Training history
        self.history = {
            'gen_loss': [],
            'disc_loss': [],
            'label_accuracy': [],
            'epoch': []
        }
    
    def add_noise_to_images(self, images):
        """Add noise to images to make discriminator task harder"""
        noise = tf.random.normal(tf.shape(images), stddev=self.noise_std)
        return tf.clip_by_value(images + noise, -1.0, 1.0)
    
    @tf.function
    def train_discriminator(self, real_images, real_labels, batch_size):
        """Train discriminator with label smoothing and noise injection"""
        # Generate fake images
        noise = tf.random.normal([batch_size, self.latent_dim])
        fake_labels = tf.random.uniform([batch_size], 0, self.num_classes, dtype=tf.int32)
        fake_images = self.generator([noise, fake_labels], training=True)
        
        # Add noise to both real and fake images
        real_images_noisy = self.add_noise_to_images(real_images)
        fake_images_noisy = self.add_noise_to_images(fake_images)
        
        with tf.GradientTape() as tape:
            # Real images with label smoothing
            real_validity, real_label_pred = self.discriminator([real_images_noisy, real_labels], training=True)
            
            # Fake images with label smoothing
            fake_validity, fake_label_pred = self.discriminator([fake_images_noisy, fake_labels], training=True)
            
            # Adversarial losses with label smoothing
            real_loss = self.bce_loss(
                tf.ones_like(real_validity) * self.real_label_smoothing, 
                real_validity
            )
            fake_loss = self.bce_loss(
                tf.ones_like(fake_validity) * self.fake_label_smoothing, 
                fake_validity
            )
            
            # Label classification losses
            real_label_loss = self.categorical_loss(real_labels, real_label_pred)
            fake_label_loss = self.categorical_loss(fake_labels, fake_label_pred)
            
            # Total discriminator loss
            disc_loss = (real_loss + fake_loss) / 2 + (real_label_loss + fake_label_loss) / 2
        
        # Update discriminator
        gradients = tape.gradient(disc_loss, self.discriminator.trainable_variables)
        self.disc_optimizer.apply_gradients(zip(gradients, self.discriminator.trainable_variables))
        
        # Update metrics
        self.disc_loss_metric.update_state(disc_loss)
        self.acc_metric.update_state(real_labels, real_label_pred)
        
        return disc_loss
    
    @tf.function
    def train_generator(self, batch_size):
        """Train generator with stronger loss to compete with discriminator"""
        noise = tf.random.normal([batch_size, self.latent_dim])
        fake_labels = tf.random.uniform([batch_size], 0, self.num_classes, dtype=tf.int32)
        
        with tf.GradientTape() as tape:
            fake_images = self.generator([noise, fake_labels], training=True)
            fake_validity, fake_label_pred = self.discriminator([fake_images, fake_labels], training=True)
            
            # Generator wants discriminator to classify fake images as real (no smoothing here)
            adversarial_loss = self.bce_loss(tf.ones_like(fake_validity), fake_validity)
            
            # Generator wants correct label classification
            label_loss = self.categorical_loss(fake_labels, fake_label_pred)
            
            # Total generator loss with stronger adversarial component
            gen_loss = adversarial_loss * 1.5 + label_loss  # Boost adversarial loss
        
        # Update generator
        gradients = tape.gradient(gen_loss, self.generator.trainable_variables)
        self.gen_optimizer.apply_gradients(zip(gradients, self.generator.trainable_variables))
        
        # Update metrics
        self.gen_loss_metric.update_state(gen_loss)
        
        return gen_loss
    
    def train_epoch(self, dataset, epoch, steps_per_epoch):
        """Train for one epoch with balanced training strategy"""
        print(f"\nEpoch {epoch + 1}")
        print("-" * 60)
        
        # Reset metrics
        self.gen_loss_metric.reset_states()
        self.disc_loss_metric.reset_states()
        self.acc_metric.reset_states()
        
        # Training loop with balanced strategy
        for step, (real_images, real_labels) in enumerate(dataset.take(steps_per_epoch)):
            batch_size = tf.shape(real_images)[0]
            
            # Train discriminator less frequently to prevent it from being too strong
            if step % 2 == 0:  # Train discriminator every other step
                disc_loss = self.train_discriminator(real_images, real_labels, batch_size)
            
            # Train generator more frequently
            for _ in range(self.gen_train_freq):
                gen_loss = self.train_generator(batch_size)
            
            # Print progress
            if step % 100 == 0:
                print(f"Step {step:4d}/{steps_per_epoch} - "
                      f"Gen Loss: {self.gen_loss_metric.result():.4f}, "
                      f"Disc Loss: {self.disc_loss_metric.result():.4f}, "
                      f"Accuracy: {self.acc_metric.result():.4f}")
        
        # Store epoch results
        self.history['gen_loss'].append(float(self.gen_loss_metric.result()))
        self.history['disc_loss'].append(float(self.disc_loss_metric.result()))
        self.history['label_accuracy'].append(float(self.acc_metric.result()))
        self.history['epoch'].append(epoch)

        # Check if discriminator is still too strong
        acc = self.acc_metric.result()
        if acc > 0.95:
            print("Discriminator accuracy > 95% - consider further balancing")
        elif acc < 0.70:
            print("Discriminator accuracy < 70% - discriminator might be too weak")
        else:
            print("Discriminator accuracy in good range (70-95%)")

def display_generated_samples_grid(generator, class_to_letter, epoch=None, samples_per_class=6):
    """Generate and display multiple samples per class in grid format"""
    print(f"Generating {samples_per_class} samples per class for all {len(class_to_letter)} letter classes...")
    
    # Generate samples for each class
    all_images = []
    all_labels = []
    
    for class_idx in sorted(class_to_letter.keys()):
        # Generate multiple samples for this class
        noise = tf.random.normal([samples_per_class, 100])
        labels = tf.fill([samples_per_class], class_idx)
        generated_images = generator([noise, labels], training=False)
        
        all_images.append(generated_images)
        all_labels.extend([class_idx] * samples_per_class)
    
    # Concatenate all generated images
    all_generated = tf.concat(all_images, axis=0)
    
    # Calculate grid dimensions
    n_classes = len(class_to_letter)
    n_cols = min(13, n_classes)  # Max 13 columns to fit screen
    n_rows_per_class = samples_per_class
    total_rows = n_rows_per_class * ((n_classes + n_cols - 1) // n_cols)
    
    # Create the plot with proper sizing
    fig_width = max(15, n_cols * 1.2)
    fig_height = max(10, total_rows * 0.8)
    fig, axes = plt.subplots(total_rows, n_cols, figsize=(fig_width, fig_height))
    
    if epoch is not None:
        fig.suptitle(f'Generated Letter Samples - All Classes (Epoch {epoch})\n{samples_per_class} samples per class', 
                     fontsize=16, fontweight='bold')
    else:
        fig.suptitle(f'Generated Letter Samples - All Classes\n{samples_per_class} samples per class', 
                     fontsize=16, fontweight='bold')
    
    # Handle axes indexing
    if total_rows == 1:
        axes = axes.reshape(1, -1)
    elif n_cols == 1:
        axes = axes.reshape(-1, 1)
    
    # Initialize all axes as empty
    for i in range(total_rows):
        for j in range(n_cols):
            axes[i, j].axis('off')
    
    # Fill the grid with generated images
    for class_idx_enum, class_idx in enumerate(sorted(class_to_letter.keys())):
        letter = class_to_letter[class_idx]
        
        # Calculate which "class column" this belongs to
        class_col = class_idx_enum % n_cols
        class_row_start = (class_idx_enum // n_cols) * samples_per_class
        
        # Display all samples for this class
        for sample_idx in range(samples_per_class):
            row = class_row_start + sample_idx
            col = class_col
            
            if row < total_rows and col < n_cols:
                ax = axes[row, col]
                
                # Get the corresponding image
                img_idx = class_idx_enum * samples_per_class + sample_idx
                img = all_generated[img_idx].numpy().squeeze()
                
                # Display image
                ax.imshow(img, cmap='gray', vmin=-1, vmax=1)
                ax.axis('off')
                
                # Add class label on first sample of each class
                if sample_idx == 0:
                    ax.set_title(f'{letter} (Class {class_idx})', fontsize=10, fontweight='bold')
    
    plt.tight_layout()
    plt.show()
    
    return all_generated, all_labels


balanced_enhanced_trainer = BalancedGANTrainer(
    generator=enhanced_generator,
    discriminator=enhanced_discriminator,
    latent_dim=100,
    num_classes=num_classes
)
In [21]:
# =============================================================================
# ENHANCED DCGAN TRAINING - COMPLETE 50 EPOCH TRAINING
# =============================================================================

# Test visualization first
print("Testing visualization with current enhanced models...")
test_images, test_labels = display_generated_samples_grid(
    enhanced_generator, class_to_letter, samples_per_class=6
)

# Training configuration
NUM_EPOCHS = 50

start_time = time.time()

for epoch in range(NUM_EPOCHS):
    epoch_start = time.time()
    
    # Train for one epoch with balanced strategy
    balanced_enhanced_trainer.train_epoch(train_dataset, epoch, steps_per_epoch)
    
    # Display only on the final epoch
    if (epoch + 1) == NUM_EPOCHS:
        print(f"\nGenerating Grid of Samples at Final Epoch {epoch + 1}:")
        display_generated_samples_grid(enhanced_generator, class_to_letter, epoch + 1, samples_per_class=6)
    
    # Calculate and display epoch timing
    epoch_time = time.time() - epoch_start
    total_time = time.time() - start_time
    avg_time = total_time / (epoch + 1)
    eta = avg_time * (NUM_EPOCHS - epoch - 1)
    print(f"Epoch {epoch + 1}/{NUM_EPOCHS} took {epoch_time:.1f}s - estimated time remaining: {eta/60:.1f} min")
    
total_training_time = time.time() - start_time
print(f"\nTotal training time: {total_training_time/60:.1f} minutes")


# Generate final display
print(f"\nFinal Generated Samples - All Letter Classes:")
final_images, final_labels = display_generated_samples_grid(
    enhanced_generator, class_to_letter, NUM_EPOCHS, samples_per_class=6
)

# Display comprehensive training summary
available_letters = sorted(class_to_letter.values())
missing_letters = [letter for letter in 'ABCDEFGHIJKLMNOPQRSTUVWXYZ' if letter not in available_letters]
print(f"Letter classes covered ({len(available_letters)}): {', '.join(available_letters)}")
if missing_letters:
    print(f"Letters absent from this subset: {', '.join(missing_letters)}")

# Plot training progress
if len(balanced_enhanced_trainer.history['gen_loss']) > 1:
    plt.figure(figsize=(15, 5))
    
    # Generator and Discriminator Loss
    plt.subplot(1, 3, 1)
    epochs = balanced_enhanced_trainer.history['epoch']
    plt.plot(epochs, balanced_enhanced_trainer.history['gen_loss'], label='Generator Loss', color='blue', linewidth=2)
    plt.plot(epochs, balanced_enhanced_trainer.history['disc_loss'], label='Discriminator Loss', color='red', linewidth=2)
    plt.title('Enhanced DCGAN - Training Losses', fontsize=14, fontweight='bold')
    plt.xlabel('Epoch')
    plt.ylabel('Loss')
    plt.legend()
    plt.grid(True, alpha=0.3)
    
    # Discriminator Accuracy
    plt.subplot(1, 3, 2)
    plt.plot(epochs, balanced_enhanced_trainer.history['label_accuracy'], label='Discriminator Accuracy', color='green', linewidth=2)
    plt.axhline(y=0.95, color='red', linestyle='--', alpha=0.7, label='Upper limit (95%)')
    plt.axhline(y=0.70, color='orange', linestyle='--', alpha=0.7, label='Lower limit (70%)')
    plt.title('Enhanced DCGAN - Discriminator Accuracy', fontsize=14, fontweight='bold')
    plt.xlabel('Epoch')
    plt.ylabel('Accuracy')
    plt.legend()
    plt.grid(True, alpha=0.3)
    
    # Loss Difference
    plt.subplot(1, 3, 3)
    loss_diff = [g - d for g, d in zip(balanced_enhanced_trainer.history['gen_loss'], balanced_enhanced_trainer.history['disc_loss'])]
    plt.plot(epochs, loss_diff, label='Gen Loss - Disc Loss', color='purple', linewidth=2)
    plt.axhline(y=0, color='black', linestyle='-', alpha=0.5)
    plt.title('Enhanced DCGAN - Loss Balance', fontsize=14, fontweight='bold')
    plt.xlabel('Epoch')
    plt.ylabel('Loss Difference')
    plt.legend()
    plt.grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.show()
Testing visualization with current enhanced models...
Generating 6 samples per class for all 16 letter classes...
[Image: grid of generated letter samples, 6 per class]
Epoch 1
------------------------------------------------------------
Step    0/597 - Gen Loss: 5.3705, Disc Loss: 4.8662, Accuracy: 0.0625
Step  100/597 - Gen Loss: 4.6713, Disc Loss: 3.8841, Accuracy: 0.1569
Step  200/597 - Gen Loss: 3.9121, Disc Loss: 3.2987, Accuracy: 0.2537
Step  300/597 - Gen Loss: 3.1612, Disc Loss: 2.7788, Accuracy: 0.3339
Step  400/597 - Gen Loss: 2.7231, Disc Loss: 2.4758, Accuracy: 0.3888
Step  500/597 - Gen Loss: 2.4469, Disc Loss: 2.2765, Accuracy: 0.4304
Discriminator accuracy < 70% - discriminator might be too weak

Epoch 2
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.2643, Disc Loss: 1.2957, Accuracy: 0.6406
Step  100/597 - Gen Loss: 1.2673, Disc Loss: 1.3416, Accuracy: 0.6520
Step  200/597 - Gen Loss: 1.2671, Disc Loss: 1.3237, Accuracy: 0.6609
Step  300/597 - Gen Loss: 1.2632, Disc Loss: 1.3082, Accuracy: 0.6727
Step  400/597 - Gen Loss: 1.2650, Disc Loss: 1.2890, Accuracy: 0.6809
Step  500/597 - Gen Loss: 1.2673, Disc Loss: 1.2701, Accuracy: 0.6893
Discriminator accuracy < 70% - discriminator might be too weak

Epoch 3
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.3010, Disc Loss: 1.0230, Accuracy: 0.7969
Step  100/597 - Gen Loss: 1.2846, Disc Loss: 1.1293, Accuracy: 0.7598
Step  200/597 - Gen Loss: 1.2917, Disc Loss: 1.1186, Accuracy: 0.7696
Step  300/597 - Gen Loss: 1.2915, Disc Loss: 1.1114, Accuracy: 0.7747
Step  400/597 - Gen Loss: 1.2929, Disc Loss: 1.0991, Accuracy: 0.7866
Step  500/597 - Gen Loss: 1.2893, Disc Loss: 1.0904, Accuracy: 0.7936
Discriminator accuracy in good range (70-95%)

Epoch 4
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.2726, Disc Loss: 1.0860, Accuracy: 0.8281
Step  100/597 - Gen Loss: 1.2611, Disc Loss: 0.9966, Accuracy: 0.8456
Step  200/597 - Gen Loss: 1.2625, Disc Loss: 0.9846, Accuracy: 0.8543
Step  300/597 - Gen Loss: 1.2532, Disc Loss: 0.9714, Accuracy: 0.8607
Step  400/597 - Gen Loss: 1.2542, Disc Loss: 0.9660, Accuracy: 0.8658
Step  500/597 - Gen Loss: 1.2502, Disc Loss: 0.9565, Accuracy: 0.8720
Discriminator accuracy in good range (70-95%)

Epoch 5
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.1247, Disc Loss: 0.8879, Accuracy: 0.9219
Step  100/597 - Gen Loss: 1.2221, Disc Loss: 0.8865, Accuracy: 0.9182
Step  200/597 - Gen Loss: 1.2170, Disc Loss: 0.8829, Accuracy: 0.9183
Step  300/597 - Gen Loss: 1.2154, Disc Loss: 0.8759, Accuracy: 0.9229
Step  400/597 - Gen Loss: 1.2146, Disc Loss: 0.8732, Accuracy: 0.9244
Step  500/597 - Gen Loss: 1.2093, Disc Loss: 0.8669, Accuracy: 0.9278
Discriminator accuracy in good range (70-95%)

Epoch 6
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.2253, Disc Loss: 0.8272, Accuracy: 0.9375
Step  100/597 - Gen Loss: 1.1802, Disc Loss: 0.8378, Accuracy: 0.9482
Step  200/597 - Gen Loss: 1.1702, Disc Loss: 0.8371, Accuracy: 0.9472
Step  300/597 - Gen Loss: 1.1654, Disc Loss: 0.8301, Accuracy: 0.9500
Step  400/597 - Gen Loss: 1.1641, Disc Loss: 0.8238, Accuracy: 0.9520
Step  500/597 - Gen Loss: 1.1620, Disc Loss: 0.8195, Accuracy: 0.9547
Discriminator accuracy > 95% - consider further balancing

Epoch 7
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.1750, Disc Loss: 0.7238, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.1405, Disc Loss: 0.7797, Accuracy: 0.9715
Step  200/597 - Gen Loss: 1.1431, Disc Loss: 0.7810, Accuracy: 0.9711
Step  300/597 - Gen Loss: 1.1384, Disc Loss: 0.7781, Accuracy: 0.9725
Step  400/597 - Gen Loss: 1.1377, Disc Loss: 0.7773, Accuracy: 0.9734
Step  500/597 - Gen Loss: 1.1347, Disc Loss: 0.7759, Accuracy: 0.9747
Discriminator accuracy > 95% - consider further balancing

Epoch 8
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.2012, Disc Loss: 0.7469, Accuracy: 0.9844
Step  100/597 - Gen Loss: 1.1195, Disc Loss: 0.7566, Accuracy: 0.9856
Step  200/597 - Gen Loss: 1.1184, Disc Loss: 0.7601, Accuracy: 0.9839
Step  300/597 - Gen Loss: 1.1173, Disc Loss: 0.7579, Accuracy: 0.9846
Step  400/597 - Gen Loss: 1.1128, Disc Loss: 0.7556, Accuracy: 0.9843
Step  500/597 - Gen Loss: 1.1115, Disc Loss: 0.7531, Accuracy: 0.9852
Discriminator accuracy > 95% - consider further balancing

Epoch 9
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.1017, Disc Loss: 0.7205, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0952, Disc Loss: 0.7417, Accuracy: 0.9896
Step  200/597 - Gen Loss: 1.0976, Disc Loss: 0.7400, Accuracy: 0.9901
Step  300/597 - Gen Loss: 1.0964, Disc Loss: 0.7371, Accuracy: 0.9911
Step  400/597 - Gen Loss: 1.0944, Disc Loss: 0.7361, Accuracy: 0.9907
Step  500/597 - Gen Loss: 1.0914, Disc Loss: 0.7352, Accuracy: 0.9911
Discriminator accuracy > 95% - consider further balancing

Epoch 10
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0304, Disc Loss: 0.6985, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0767, Disc Loss: 0.7306, Accuracy: 0.9923
Step  200/597 - Gen Loss: 1.0789, Disc Loss: 0.7300, Accuracy: 0.9933
Step  300/597 - Gen Loss: 1.0802, Disc Loss: 0.7293, Accuracy: 0.9934
Step  400/597 - Gen Loss: 1.0803, Disc Loss: 0.7285, Accuracy: 0.9932
Step  500/597 - Gen Loss: 1.0783, Disc Loss: 0.7270, Accuracy: 0.9932
Discriminator accuracy > 95% - consider further balancing

Epoch 11
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0526, Disc Loss: 0.7288, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0714, Disc Loss: 0.7229, Accuracy: 0.9957
Step  200/597 - Gen Loss: 1.0666, Disc Loss: 0.7225, Accuracy: 0.9947
Step  300/597 - Gen Loss: 1.0681, Disc Loss: 0.7220, Accuracy: 0.9954
Step  400/597 - Gen Loss: 1.0685, Disc Loss: 0.7210, Accuracy: 0.9958
Step  500/597 - Gen Loss: 1.0671, Disc Loss: 0.7201, Accuracy: 0.9956
Discriminator accuracy > 95% - consider further balancing

Epoch 12
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0621, Disc Loss: 0.7100, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0654, Disc Loss: 0.7138, Accuracy: 0.9942
Step  200/597 - Gen Loss: 1.0681, Disc Loss: 0.7143, Accuracy: 0.9955
Step  300/597 - Gen Loss: 1.0673, Disc Loss: 0.7139, Accuracy: 0.9959
Step  400/597 - Gen Loss: 1.0689, Disc Loss: 0.7137, Accuracy: 0.9961
Step  500/597 - Gen Loss: 1.0670, Disc Loss: 0.7131, Accuracy: 0.9963
Discriminator accuracy > 95% - consider further balancing

Epoch 13
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0717, Disc Loss: 0.7038, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0655, Disc Loss: 0.7077, Accuracy: 0.9988
Step  200/597 - Gen Loss: 1.0658, Disc Loss: 0.7104, Accuracy: 0.9972
Step  300/597 - Gen Loss: 1.0652, Disc Loss: 0.7106, Accuracy: 0.9974
Step  400/597 - Gen Loss: 1.0635, Disc Loss: 0.7097, Accuracy: 0.9976
Step  500/597 - Gen Loss: 1.0621, Disc Loss: 0.7094, Accuracy: 0.9978
Discriminator accuracy > 95% - consider further balancing

Epoch 14
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0717, Disc Loss: 0.7057, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0562, Disc Loss: 0.7077, Accuracy: 0.9975
Step  200/597 - Gen Loss: 1.0575, Disc Loss: 0.7071, Accuracy: 0.9981
Step  300/597 - Gen Loss: 1.0545, Disc Loss: 0.7066, Accuracy: 0.9980
Step  400/597 - Gen Loss: 1.0547, Disc Loss: 0.7067, Accuracy: 0.9982
Step  500/597 - Gen Loss: 1.0538, Disc Loss: 0.7066, Accuracy: 0.9981
Discriminator accuracy > 95% - consider further balancing

Epoch 15
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0361, Disc Loss: 0.7016, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0508, Disc Loss: 0.7048, Accuracy: 0.9994
Step  200/597 - Gen Loss: 1.0515, Disc Loss: 0.7055, Accuracy: 0.9992
Step  300/597 - Gen Loss: 1.0510, Disc Loss: 0.7057, Accuracy: 0.9987
Step  400/597 - Gen Loss: 1.0520, Disc Loss: 0.7053, Accuracy: 0.9985
Step  500/597 - Gen Loss: 1.0509, Disc Loss: 0.7048, Accuracy: 0.9987
Discriminator accuracy > 95% - consider further balancing

Epoch 16
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0673, Disc Loss: 0.7062, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0546, Disc Loss: 0.7046, Accuracy: 0.9997
Step  200/597 - Gen Loss: 1.0532, Disc Loss: 0.7028, Accuracy: 0.9994
Step  300/597 - Gen Loss: 1.0521, Disc Loss: 0.7027, Accuracy: 0.9994
Step  400/597 - Gen Loss: 1.0516, Disc Loss: 0.7028, Accuracy: 0.9992
Step  500/597 - Gen Loss: 1.0513, Disc Loss: 0.7025, Accuracy: 0.9992
Discriminator accuracy > 95% - consider further balancing

Epoch 17
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0353, Disc Loss: 0.7014, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0431, Disc Loss: 0.7019, Accuracy: 0.9997
Step  200/597 - Gen Loss: 1.0447, Disc Loss: 0.7017, Accuracy: 0.9997
Step  300/597 - Gen Loss: 1.0464, Disc Loss: 0.7012, Accuracy: 0.9995
Step  400/597 - Gen Loss: 1.0462, Disc Loss: 0.7009, Accuracy: 0.9994
Step  500/597 - Gen Loss: 1.0457, Disc Loss: 0.7003, Accuracy: 0.9995
Discriminator accuracy > 95% - consider further balancing

Epoch 18
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0646, Disc Loss: 0.7028, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0435, Disc Loss: 0.7022, Accuracy: 0.9994
Step  200/597 - Gen Loss: 1.0424, Disc Loss: 0.7011, Accuracy: 0.9997
Step  300/597 - Gen Loss: 1.0440, Disc Loss: 0.7006, Accuracy: 0.9997
Step  400/597 - Gen Loss: 1.0442, Disc Loss: 0.7006, Accuracy: 0.9998
Step  500/597 - Gen Loss: 1.0439, Disc Loss: 0.7004, Accuracy: 0.9998
Discriminator accuracy > 95% - consider further balancing

Epoch 19
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0519, Disc Loss: 0.7129, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0439, Disc Loss: 0.6986, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.0436, Disc Loss: 0.6996, Accuracy: 0.9997
Step  300/597 - Gen Loss: 1.0415, Disc Loss: 0.7000, Accuracy: 0.9994
Step  400/597 - Gen Loss: 1.0412, Disc Loss: 0.6998, Accuracy: 0.9995
Step  500/597 - Gen Loss: 1.0431, Disc Loss: 0.6995, Accuracy: 0.9996
Discriminator accuracy > 95% - consider further balancing

Epoch 20
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0527, Disc Loss: 0.6998, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0412, Disc Loss: 0.6976, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.0435, Disc Loss: 0.6983, Accuracy: 0.9998
Step  300/597 - Gen Loss: 1.0436, Disc Loss: 0.6984, Accuracy: 0.9998
Step  400/597 - Gen Loss: 1.0438, Disc Loss: 0.6983, Accuracy: 0.9998
Step  500/597 - Gen Loss: 1.0441, Disc Loss: 0.6982, Accuracy: 0.9998
Discriminator accuracy > 95% - consider further balancing

Epoch 21
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0196, Disc Loss: 0.6986, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0444, Disc Loss: 0.6968, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.0415, Disc Loss: 0.6964, Accuracy: 0.9997
Step  300/597 - Gen Loss: 1.0409, Disc Loss: 0.6970, Accuracy: 0.9997
Step  400/597 - Gen Loss: 1.0413, Disc Loss: 0.6968, Accuracy: 0.9997
Step  500/597 - Gen Loss: 1.0408, Disc Loss: 0.6969, Accuracy: 0.9997
Discriminator accuracy > 95% - consider further balancing

Epoch 22
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0265, Disc Loss: 0.7005, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0384, Disc Loss: 0.6961, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.0396, Disc Loss: 0.6962, Accuracy: 0.9998
Step  300/597 - Gen Loss: 1.0391, Disc Loss: 0.6966, Accuracy: 0.9998
Step  400/597 - Gen Loss: 1.0385, Disc Loss: 0.6968, Accuracy: 0.9998
Step  500/597 - Gen Loss: 1.0392, Disc Loss: 0.6969, Accuracy: 0.9998
Discriminator accuracy > 95% - consider further balancing

Epoch 23
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0718, Disc Loss: 0.6942, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0433, Disc Loss: 0.6962, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.0419, Disc Loss: 0.6962, Accuracy: 1.0000
Step  300/597 - Gen Loss: 1.0400, Disc Loss: 0.6965, Accuracy: 0.9999
Step  400/597 - Gen Loss: 1.0396, Disc Loss: 0.6967, Accuracy: 0.9998
Step  500/597 - Gen Loss: 1.0389, Disc Loss: 0.6967, Accuracy: 0.9999
Discriminator accuracy > 95% - consider further balancing

Epoch 24
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0402, Disc Loss: 0.6985, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0394, Disc Loss: 0.6961, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.0396, Disc Loss: 0.6961, Accuracy: 1.0000
Step  300/597 - Gen Loss: 1.0389, Disc Loss: 0.6962, Accuracy: 1.0000
Step  400/597 - Gen Loss: 1.0389, Disc Loss: 0.6963, Accuracy: 0.9999
Step  500/597 - Gen Loss: 1.0389, Disc Loss: 0.6963, Accuracy: 0.9999
Discriminator accuracy > 95% - consider further balancing

Epoch 25
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0536, Disc Loss: 0.6964, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0454, Disc Loss: 0.6958, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.0434, Disc Loss: 0.6964, Accuracy: 0.9997
Step  300/597 - Gen Loss: 1.0424, Disc Loss: 0.6961, Accuracy: 0.9998
Step  400/597 - Gen Loss: 1.0416, Disc Loss: 0.6959, Accuracy: 0.9998
Step  500/597 - Gen Loss: 1.0412, Disc Loss: 0.6959, Accuracy: 0.9999
Discriminator accuracy > 95% - consider further balancing

Epoch 26
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0490, Disc Loss: 0.6999, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0371, Disc Loss: 0.6955, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.0373, Disc Loss: 0.6954, Accuracy: 1.0000
Step  300/597 - Gen Loss: 1.0376, Disc Loss: 0.6958, Accuracy: 1.0000
Step  400/597 - Gen Loss: 1.0376, Disc Loss: 0.6958, Accuracy: 1.0000
Step  500/597 - Gen Loss: 1.0381, Disc Loss: 0.6956, Accuracy: 1.0000
Discriminator accuracy > 95% - consider further balancing

Epoch 27
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0314, Disc Loss: 0.6988, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0383, Disc Loss: 0.6949, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.0383, Disc Loss: 0.6950, Accuracy: 1.0000
Step  300/597 - Gen Loss: 1.0390, Disc Loss: 0.6947, Accuracy: 1.0000
Step  400/597 - Gen Loss: 1.0385, Disc Loss: 0.6948, Accuracy: 1.0000
Step  500/597 - Gen Loss: 1.0382, Disc Loss: 0.6947, Accuracy: 1.0000
Discriminator accuracy > 95% - consider further balancing

Epoch 28
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0408, Disc Loss: 0.6919, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0369, Disc Loss: 0.6951, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.0366, Disc Loss: 0.6952, Accuracy: 1.0000
Step  300/597 - Gen Loss: 1.0363, Disc Loss: 0.6952, Accuracy: 0.9999
Step  400/597 - Gen Loss: 1.0362, Disc Loss: 0.6953, Accuracy: 0.9998
Step  500/597 - Gen Loss: 1.0362, Disc Loss: 0.6952, Accuracy: 0.9999
Discriminator accuracy > 95% - consider further balancing

Epoch 29
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0469, Disc Loss: 0.6961, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0384, Disc Loss: 0.6956, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.0376, Disc Loss: 0.6958, Accuracy: 0.9994
Step  300/597 - Gen Loss: 1.0396, Disc Loss: 0.6956, Accuracy: 0.9996
Step  400/597 - Gen Loss: 1.0395, Disc Loss: 0.6954, Accuracy: 0.9997
Step  500/597 - Gen Loss: 1.0390, Disc Loss: 0.6954, Accuracy: 0.9998
Discriminator accuracy > 95% - consider further balancing

Epoch 30
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0403, Disc Loss: 0.6909, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0407, Disc Loss: 0.6949, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.0377, Disc Loss: 0.6946, Accuracy: 1.0000
Step  300/597 - Gen Loss: 1.0385, Disc Loss: 0.6946, Accuracy: 1.0000
Step  400/597 - Gen Loss: 1.0390, Disc Loss: 0.6946, Accuracy: 1.0000
Step  500/597 - Gen Loss: 1.0387, Disc Loss: 0.6946, Accuracy: 1.0000
Discriminator accuracy > 95% - consider further balancing

Epoch 31
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0303, Disc Loss: 0.6999, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0371, Disc Loss: 0.6941, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.0386, Disc Loss: 0.6944, Accuracy: 1.0000
Step  300/597 - Gen Loss: 1.0382, Disc Loss: 0.6942, Accuracy: 1.0000
Step  400/597 - Gen Loss: 1.0382, Disc Loss: 0.6944, Accuracy: 1.0000
Step  500/597 - Gen Loss: 1.0380, Disc Loss: 0.6946, Accuracy: 1.0000
Discriminator accuracy > 95% - consider further balancing

Epoch 32
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0376, Disc Loss: 0.6987, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0400, Disc Loss: 0.6948, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.0403, Disc Loss: 0.6948, Accuracy: 1.0000
Step  300/597 - Gen Loss: 1.0400, Disc Loss: 0.6947, Accuracy: 1.0000
Step  400/597 - Gen Loss: 1.0398, Disc Loss: 0.6946, Accuracy: 1.0000
Step  500/597 - Gen Loss: 1.0394, Disc Loss: 0.6944, Accuracy: 1.0000
Discriminator accuracy > 95% - consider further balancing

Epoch 33
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0348, Disc Loss: 0.6958, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0382, Disc Loss: 0.6950, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.0377, Disc Loss: 0.6946, Accuracy: 1.0000
Step  300/597 - Gen Loss: 1.0377, Disc Loss: 0.6943, Accuracy: 1.0000
Step  400/597 - Gen Loss: 1.0374, Disc Loss: 0.6942, Accuracy: 1.0000
Step  500/597 - Gen Loss: 1.0368, Disc Loss: 0.6942, Accuracy: 1.0000
Discriminator accuracy > 95% - consider further balancing

Epoch 34
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0542, Disc Loss: 0.6898, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0390, Disc Loss: 0.6946, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.0389, Disc Loss: 0.6944, Accuracy: 0.9998
Step  300/597 - Gen Loss: 1.0377, Disc Loss: 0.6945, Accuracy: 0.9999
Step  400/597 - Gen Loss: 1.0380, Disc Loss: 0.6945, Accuracy: 0.9998
Step  500/597 - Gen Loss: 1.0381, Disc Loss: 0.6945, Accuracy: 0.9999
Discriminator accuracy > 95% - consider further balancing

Epoch 35
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0477, Disc Loss: 0.6901, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0409, Disc Loss: 0.6941, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.0391, Disc Loss: 0.6943, Accuracy: 1.0000
Step  300/597 - Gen Loss: 1.0389, Disc Loss: 0.6946, Accuracy: 1.0000
Step  400/597 - Gen Loss: 1.0381, Disc Loss: 0.6946, Accuracy: 1.0000
Step  500/597 - Gen Loss: 1.0379, Disc Loss: 0.6945, Accuracy: 1.0000
Discriminator accuracy > 95% - consider further balancing

Epoch 36
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0344, Disc Loss: 0.6949, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0416, Disc Loss: 0.6938, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.0397, Disc Loss: 0.6943, Accuracy: 0.9998
Step  300/597 - Gen Loss: 1.0394, Disc Loss: 0.6945, Accuracy: 0.9999
Step  400/597 - Gen Loss: 1.0391, Disc Loss: 0.6944, Accuracy: 0.9999
Step  500/597 - Gen Loss: 1.0384, Disc Loss: 0.6944, Accuracy: 0.9999
Discriminator accuracy > 95% - consider further balancing

Epoch 37
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0308, Disc Loss: 0.6962, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0370, Disc Loss: 0.6943, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.0374, Disc Loss: 0.6944, Accuracy: 1.0000
Step  300/597 - Gen Loss: 1.0378, Disc Loss: 0.6943, Accuracy: 1.0000
Step  400/597 - Gen Loss: 1.0388, Disc Loss: 0.6943, Accuracy: 1.0000
Step  500/597 - Gen Loss: 1.0385, Disc Loss: 0.6943, Accuracy: 1.0000
Discriminator accuracy > 95% - consider further balancing

Epoch 38
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0428, Disc Loss: 0.6954, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0394, Disc Loss: 0.6945, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.0379, Disc Loss: 0.6942, Accuracy: 1.0000
Step  300/597 - Gen Loss: 1.0378, Disc Loss: 0.6942, Accuracy: 1.0000
Step  400/597 - Gen Loss: 1.0380, Disc Loss: 0.6940, Accuracy: 1.0000
Step  500/597 - Gen Loss: 1.0375, Disc Loss: 0.6940, Accuracy: 1.0000
Discriminator accuracy > 95% - consider further balancing

Epoch 39
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0369, Disc Loss: 0.6946, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0391, Disc Loss: 0.6942, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.0375, Disc Loss: 0.6944, Accuracy: 1.0000
Step  300/597 - Gen Loss: 1.0384, Disc Loss: 0.6942, Accuracy: 1.0000
Step  400/597 - Gen Loss: 1.0382, Disc Loss: 0.6942, Accuracy: 1.0000
Step  500/597 - Gen Loss: 1.0382, Disc Loss: 0.6942, Accuracy: 1.0000
Discriminator accuracy > 95% - consider further balancing

Epoch 40
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0430, Disc Loss: 0.6961, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0396, Disc Loss: 0.6943, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.0374, Disc Loss: 0.6945, Accuracy: 1.0000
Step  300/597 - Gen Loss: 1.0373, Disc Loss: 0.6942, Accuracy: 1.0000
Step  400/597 - Gen Loss: 1.0372, Disc Loss: 0.6941, Accuracy: 1.0000
Step  500/597 - Gen Loss: 1.0371, Disc Loss: 0.6942, Accuracy: 1.0000
Discriminator accuracy > 95% - consider further balancing

Epoch 41
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0425, Disc Loss: 0.6948, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0393, Disc Loss: 0.6943, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.0386, Disc Loss: 0.6941, Accuracy: 1.0000
Step  300/597 - Gen Loss: 1.0379, Disc Loss: 0.6941, Accuracy: 1.0000
Step  400/597 - Gen Loss: 1.0376, Disc Loss: 0.6941, Accuracy: 1.0000
Step  500/597 - Gen Loss: 1.0382, Disc Loss: 0.6942, Accuracy: 1.0000
Discriminator accuracy > 95% - consider further balancing

Epoch 42
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0422, Disc Loss: 0.6933, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0396, Disc Loss: 0.6944, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.0373, Disc Loss: 0.6941, Accuracy: 1.0000
Step  300/597 - Gen Loss: 1.0383, Disc Loss: 0.6942, Accuracy: 1.0000
Step  400/597 - Gen Loss: 1.0385, Disc Loss: 0.6942, Accuracy: 1.0000
Step  500/597 - Gen Loss: 1.0383, Disc Loss: 0.6941, Accuracy: 1.0000
Discriminator accuracy > 95% - consider further balancing

Epoch 43
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0482, Disc Loss: 0.6915, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0398, Disc Loss: 0.6939, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.0381, Disc Loss: 0.6940, Accuracy: 1.0000
Step  300/597 - Gen Loss: 1.0386, Disc Loss: 0.6940, Accuracy: 1.0000
Step  400/597 - Gen Loss: 1.0383, Disc Loss: 0.6940, Accuracy: 1.0000
Step  500/597 - Gen Loss: 1.0381, Disc Loss: 0.6940, Accuracy: 1.0000
Discriminator accuracy > 95% - consider further balancing

Epoch 44
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0555, Disc Loss: 0.6914, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0367, Disc Loss: 0.6942, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.0384, Disc Loss: 0.6942, Accuracy: 1.0000
Step  300/597 - Gen Loss: 1.0384, Disc Loss: 0.6942, Accuracy: 1.0000
Step  400/597 - Gen Loss: 1.0382, Disc Loss: 0.6941, Accuracy: 1.0000
Step  500/597 - Gen Loss: 1.0373, Disc Loss: 0.6941, Accuracy: 1.0000
Discriminator accuracy > 95% - consider further balancing

Epoch 45
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0300, Disc Loss: 0.6946, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0341, Disc Loss: 0.6941, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.0356, Disc Loss: 0.6941, Accuracy: 1.0000
Step  300/597 - Gen Loss: 1.0366, Disc Loss: 0.6939, Accuracy: 1.0000
Step  400/597 - Gen Loss: 1.0370, Disc Loss: 0.6940, Accuracy: 1.0000
Step  500/597 - Gen Loss: 1.0373, Disc Loss: 0.6940, Accuracy: 1.0000
Discriminator accuracy > 95% - consider further balancing

Epoch 46
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0367, Disc Loss: 0.6928, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0391, Disc Loss: 0.6937, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.0385, Disc Loss: 0.6938, Accuracy: 1.0000
Step  300/597 - Gen Loss: 1.0375, Disc Loss: 0.6938, Accuracy: 1.0000
Step  400/597 - Gen Loss: 1.0376, Disc Loss: 0.6939, Accuracy: 1.0000
Step  500/597 - Gen Loss: 1.0371, Disc Loss: 0.6939, Accuracy: 1.0000
Discriminator accuracy > 95% - consider further balancing

Epoch 47
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0415, Disc Loss: 0.6920, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0351, Disc Loss: 0.6943, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.0362, Disc Loss: 0.6942, Accuracy: 1.0000
Step  300/597 - Gen Loss: 1.0362, Disc Loss: 0.6941, Accuracy: 1.0000
Step  400/597 - Gen Loss: 1.0368, Disc Loss: 0.6941, Accuracy: 1.0000
Step  500/597 - Gen Loss: 1.0373, Disc Loss: 0.6941, Accuracy: 1.0000
Discriminator accuracy > 95% - consider further balancing

Epoch 48
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0347, Disc Loss: 0.6917, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0378, Disc Loss: 0.6939, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.0374, Disc Loss: 0.6941, Accuracy: 1.0000
Step  300/597 - Gen Loss: 1.0377, Disc Loss: 0.6939, Accuracy: 1.0000
Step  400/597 - Gen Loss: 1.0373, Disc Loss: 0.6940, Accuracy: 1.0000
Step  500/597 - Gen Loss: 1.0372, Disc Loss: 0.6940, Accuracy: 1.0000
Discriminator accuracy > 95% - consider further balancing

Epoch 49
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0373, Disc Loss: 0.6911, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0376, Disc Loss: 0.6942, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.0373, Disc Loss: 0.6939, Accuracy: 1.0000
Step  300/597 - Gen Loss: 1.0372, Disc Loss: 0.6940, Accuracy: 1.0000
Step  400/597 - Gen Loss: 1.0367, Disc Loss: 0.6940, Accuracy: 1.0000
Step  500/597 - Gen Loss: 1.0371, Disc Loss: 0.6940, Accuracy: 1.0000
Discriminator accuracy > 95% - consider further balancing

Epoch 50
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0386, Disc Loss: 0.6944, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.0358, Disc Loss: 0.6939, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.0376, Disc Loss: 0.6941, Accuracy: 1.0000
Step  300/597 - Gen Loss: 1.0371, Disc Loss: 0.6941, Accuracy: 1.0000
Step  400/597 - Gen Loss: 1.0372, Disc Loss: 0.6941, Accuracy: 1.0000
Step  500/597 - Gen Loss: 1.0372, Disc Loss: 0.6940, Accuracy: 1.0000
Discriminator accuracy > 95% - consider further balancing

Generating Grid of Samples at Final Epoch 50:
Generating 6 samples per class for all 16 letter classes...
Final Generated Samples - All Letter Classes:
Generating 6 samples per class for all 16 letter classes...

Observations:¶

  • Generated Samples: Letters are sharper and more consistent than baseline; however, some (e.g., G, Q, O) remain distorted, and several classes (T, X, Z) show repetitive patterns.
  • Training Losses: Both generator and discriminator losses converge quickly to low values, indicating stable training but possible early saturation.
  • Discriminator Accuracy: Stays near 100% after a few epochs → discriminator still too strong.
  • Loss Balance: Small, steady gap between losses suggests improved stability over baseline but still biased toward the discriminator.

WGAN Training Implementation¶

This section implements Baseline and Enhanced WGAN training with the same layout and metrics as the DCGAN:

WGAN Architecture:¶

  • Baseline WGAN: Standard WGAN-GP with gradient penalty
  • Enhanced WGAN: Improved architecture with more capacity
  • Wasserstein Loss: Earth Mover's distance for better training stability

Training Features:¶

  • Gradient Penalty: Enforces Lipschitz constraint (λ=10.0)
  • Critic Training: 5 critic updates per generator update
  • No Sigmoid: Critic outputs raw scores (no probability)
  • Conditional Generation: Uses class labels for controlled generation
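
In equation form, the adversarial objectives implemented in `train_critic` and `train_generator` below (with real images $x$, fakes $\tilde{x} = G(z, y)$, interpolates $\hat{x} = \alpha x + (1-\alpha)\tilde{x}$, and $\lambda = 10$) are:

$$
L_{critic} = \mathbb{E}[D(\tilde{x})] - \mathbb{E}[D(x)] + \lambda\,\mathbb{E}\!\left[\left(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\right)^2\right], \qquad L_{gen} = -\,\mathbb{E}[D(\tilde{x})]
$$

Because the models are conditional, both losses additionally include a sparse-categorical cross-entropy term from the critic's label-prediction head.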
In [26]:
# =============================================================================
# WGAN (WASSERSTEIN GAN) IMPLEMENTATION
# =============================================================================

# Baseline WGAN Critic
def build_baseline_wgan_critic(img_height=28, img_width=28, num_classes=16):
    """Build baseline WGAN critic (discriminator without sigmoid)"""
    
    # Image input
    img_input = tf.keras.layers.Input(shape=(img_height, img_width, 1), name='img_input')
    
    # Label input  
    label_input = tf.keras.layers.Input(shape=(), dtype='int32', name='label_input')
    
    # Process image
    x = img_input
    
    # First conv block (28x28x1 -> 14x14x64)
    x = tf.keras.layers.Conv2D(64, 4, strides=2, padding='same')(x)
    x = tf.keras.layers.LeakyReLU(alpha=0.2)(x)
    
    # Second conv block (14x14x64 -> 7x7x128)
    x = tf.keras.layers.Conv2D(128, 4, strides=2, padding='same')(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.LeakyReLU(alpha=0.2)(x)
    
    # Third conv block (7x7x128 -> 4x4x256)
    x = tf.keras.layers.Conv2D(256, 4, strides=2, padding='same')(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.LeakyReLU(alpha=0.2)(x)
    
    # Flatten for dense layers
    x = tf.keras.layers.Flatten()(x)
    
    # Process label
    label_embedding = tf.keras.layers.Embedding(num_classes, 50)(label_input)
    label_embedding = tf.keras.layers.Flatten()(label_embedding)
    
    # Concatenate image features and label
    x = tf.keras.layers.Concatenate()([x, label_embedding])
    
    # Dense layers
    x = tf.keras.layers.Dense(512)(x)
    x = tf.keras.layers.LeakyReLU(alpha=0.2)(x)
    
    # Output layer - NO SIGMOID for WGAN (raw scores)
    validity = tf.keras.layers.Dense(1, name='validity')(x)  # No activation!
    label_pred = tf.keras.layers.Dense(num_classes, activation='softmax', name='label_pred')(x)
    
    model = tf.keras.Model(
        inputs=[img_input, label_input],
        outputs=[validity, label_pred],
        name='baseline_wgan_critic'
    )
    
    return model

# Baseline WGAN Generator (same as DCGAN baseline)
def build_baseline_wgan_generator(latent_dim=100, num_classes=16, img_height=28, img_width=28):
    """Build baseline WGAN generator"""
    
    # Noise input
    noise_input = tf.keras.layers.Input(shape=(latent_dim,), name='noise_input')
    
    # Label input
    label_input = tf.keras.layers.Input(shape=(), dtype='int32', name='label_input')
    
    # Process label
    label_embedding = tf.keras.layers.Embedding(num_classes, 50)(label_input)
    label_embedding = tf.keras.layers.Flatten()(label_embedding)
    
    # Concatenate noise and label
    x = tf.keras.layers.Concatenate()([noise_input, label_embedding])
    
    # Dense layer to reshape
    x = tf.keras.layers.Dense(7 * 7 * 256, use_bias=False)(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.LeakyReLU()(x)
    x = tf.keras.layers.Reshape((7, 7, 256))(x)
    
    # First transpose conv (7x7x256 -> 14x14x128)
    x = tf.keras.layers.Conv2DTranspose(128, 5, strides=2, padding='same', use_bias=False)(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.LeakyReLU()(x)
    
    # Second transpose conv (14x14x128 -> 28x28x1)
    x = tf.keras.layers.Conv2DTranspose(1, 5, strides=2, padding='same', use_bias=False, activation='tanh')(x)
    
    model = tf.keras.Model(
        inputs=[noise_input, label_input],
        outputs=x,
        name='baseline_wgan_generator'
    )
    
    return model

# Enhanced WGAN Critic with deeper architecture
def build_enhanced_wgan_critic(img_height=28, img_width=28, num_classes=16):
    """Build enhanced WGAN critic with deeper architecture"""
    
    # Image input
    img_input = tf.keras.layers.Input(shape=(img_height, img_width, 1), name='img_input')
    
    # Label input
    label_input = tf.keras.layers.Input(shape=(), dtype='int32', name='label_input')
    
    # Process image with deeper architecture
    x = img_input
    
    # First conv block (28x28x1 -> 14x14x64)
    x = tf.keras.layers.Conv2D(64, 4, strides=2, padding='same')(x)
    x = tf.keras.layers.LeakyReLU(alpha=0.2)(x)
    x = tf.keras.layers.Dropout(0.3)(x)
    
    # Second conv block (14x14x64 -> 7x7x128)
    x = tf.keras.layers.Conv2D(128, 4, strides=2, padding='same')(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.LeakyReLU(alpha=0.2)(x)
    x = tf.keras.layers.Dropout(0.3)(x)
    
    # Third conv block (7x7x128 -> 4x4x256)
    x = tf.keras.layers.Conv2D(256, 4, strides=2, padding='same')(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.LeakyReLU(alpha=0.2)(x)
    x = tf.keras.layers.Dropout(0.3)(x)
    
    # Fourth conv block (4x4x256 -> 2x2x512) - Enhanced depth
    x = tf.keras.layers.Conv2D(512, 4, strides=2, padding='same')(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.LeakyReLU(alpha=0.2)(x)
    x = tf.keras.layers.Dropout(0.4)(x)
    
    # Flatten for dense layers
    x = tf.keras.layers.Flatten()(x)
    
    # Process label
    label_embedding = tf.keras.layers.Embedding(num_classes, 100)(label_input)  # Larger embedding
    label_embedding = tf.keras.layers.Flatten()(label_embedding)
    
    # Concatenate image features and label
    x = tf.keras.layers.Concatenate()([x, label_embedding])
    
    # Enhanced dense layers
    x = tf.keras.layers.Dense(1024)(x)  # Larger capacity
    x = tf.keras.layers.LeakyReLU(alpha=0.2)(x)
    x = tf.keras.layers.Dropout(0.5)(x)
    
    x = tf.keras.layers.Dense(512)(x)
    x = tf.keras.layers.LeakyReLU(alpha=0.2)(x)
    x = tf.keras.layers.Dropout(0.5)(x)
    
    # Output layers - NO SIGMOID for WGAN
    validity = tf.keras.layers.Dense(1, name='validity')(x)  # Raw scores
    label_pred = tf.keras.layers.Dense(num_classes, activation='softmax', name='label_pred')(x)
    
    model = tf.keras.Model(
        inputs=[img_input, label_input],
        outputs=[validity, label_pred],
        name='enhanced_wgan_critic'
    )
    
    return model

# Enhanced WGAN Generator with deeper architecture
def build_enhanced_wgan_generator(latent_dim=100, num_classes=16, img_height=28, img_width=28):
    """Build enhanced WGAN generator with deeper architecture"""
    
    # Noise input
    noise_input = tf.keras.layers.Input(shape=(latent_dim,), name='noise_input')
    
    # Label input
    label_input = tf.keras.layers.Input(shape=(), dtype='int32', name='label_input')
    
    # Process label with larger embedding
    label_embedding = tf.keras.layers.Embedding(num_classes, 100)(label_input)
    label_embedding = tf.keras.layers.Flatten()(label_embedding)
    
    # Concatenate noise and label
    x = tf.keras.layers.Concatenate()([noise_input, label_embedding])
    
    # Enhanced dense preprocessing
    x = tf.keras.layers.Dense(4 * 4 * 1024, use_bias=False)(x)  # Start smaller, go deeper
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.LeakyReLU()(x)
    x = tf.keras.layers.Reshape((4, 4, 1024))(x)
    
    # First transpose conv (4x4x1024 -> 7x7x512)
    x = tf.keras.layers.Conv2DTranspose(512, 4, strides=1, padding='valid', use_bias=False)(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.LeakyReLU()(x)
    
    # Second transpose conv (7x7x512 -> 14x14x256)
    x = tf.keras.layers.Conv2DTranspose(256, 4, strides=2, padding='same', use_bias=False)(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.LeakyReLU()(x)
    
    # Third transpose conv (14x14x256 -> 28x28x128)
    x = tf.keras.layers.Conv2DTranspose(128, 4, strides=2, padding='same', use_bias=False)(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.LeakyReLU()(x)
    
    # Final layer (28x28x128 -> 28x28x1)
    x = tf.keras.layers.Conv2DTranspose(1, 4, strides=1, padding='same', use_bias=False, activation='tanh')(x)
    
    model = tf.keras.Model(
        inputs=[noise_input, label_input],
        outputs=x,
        name='enhanced_wgan_generator'
    )
    
    return model

# Build Baseline WGAN models
baseline_wgan_generator = build_baseline_wgan_generator(
    latent_dim=100, 
    num_classes=num_classes,
    img_height=28, 
    img_width=28
)

baseline_wgan_critic = build_baseline_wgan_critic(
    img_height=28, 
    img_width=28, 
    num_classes=num_classes
)

# Build Enhanced WGAN models
enhanced_wgan_generator = build_enhanced_wgan_generator(
    latent_dim=100, 
    num_classes=num_classes,
    img_height=28, 
    img_width=28
)

enhanced_wgan_critic = build_enhanced_wgan_critic(
    img_height=28, 
    img_width=28, 
    num_classes=num_classes
)
In [23]:
# =============================================================================
# WGAN TRAINER CLASS WITH GRADIENT PENALTY
# =============================================================================

class WGANTrainer:
    """WGAN-GP trainer with gradient penalty for stable training"""
    
    def __init__(self, generator, critic, latent_dim=100, num_classes=16, gradient_penalty_weight=10.0):
        self.generator = generator
        self.critic = critic
        self.latent_dim = latent_dim
        self.num_classes = num_classes
        self.gradient_penalty_weight = gradient_penalty_weight
        
        # Optimizers - different learning rates for generator and critic
        self.gen_optimizer = tf.keras.optimizers.Adam(learning_rate=0.0001, beta_1=0.5, beta_2=0.9)
        self.critic_optimizer = tf.keras.optimizers.Adam(learning_rate=0.0004, beta_1=0.5, beta_2=0.9)
        
        # Loss function for label classification
        self.categorical_loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)
        
        # Training parameters
        self.n_critic = 5  # Train critic 5 times per generator training
        
        # Metrics for tracking
        self.gen_loss_metric = tf.keras.metrics.Mean(name='gen_loss')
        self.critic_loss_metric = tf.keras.metrics.Mean(name='critic_loss')
        self.wasserstein_loss_metric = tf.keras.metrics.Mean(name='wasserstein_loss')
        self.gp_loss_metric = tf.keras.metrics.Mean(name='gp_loss')
        self.acc_metric = tf.keras.metrics.SparseCategoricalAccuracy(name='label_accuracy')
        
        # Training history
        self.history = {
            'gen_loss': [],
            'critic_loss': [],
            'wasserstein_loss': [],
            'gp_loss': [],
            'label_accuracy': [],
            'epoch': []
        }
    
    def gradient_penalty(self, real_images, fake_images, real_labels):
        """Calculate gradient penalty for WGAN-GP"""
        batch_size = tf.shape(real_images)[0]
        
        # Random interpolation factor
        alpha = tf.random.uniform([batch_size, 1, 1, 1], 0.0, 1.0)
        
        # Interpolate between real and fake images
        interpolated = alpha * real_images + (1 - alpha) * fake_images
        
        with tf.GradientTape() as tape:
            tape.watch(interpolated)
            # Get critic scores for interpolated images
            critic_output, _ = self.critic([interpolated, real_labels], training=True)
        
        # Calculate gradients of critic output w.r.t. interpolated images
        gradients = tape.gradient(critic_output, interpolated)
        
        # Calculate gradient norm
        gradients_norm = tf.sqrt(tf.reduce_sum(tf.square(gradients), axis=[1, 2, 3]))
        
        # Gradient penalty: (||grad|| - 1)^2
        gradient_penalty = tf.reduce_mean(tf.square(gradients_norm - 1.0))
        
        return gradient_penalty
    
    @tf.function
    def train_critic(self, real_images, real_labels, batch_size):
        """Train critic with Wasserstein loss and gradient penalty"""
        # Generate fake images
        noise = tf.random.normal([batch_size, self.latent_dim])
        fake_labels = tf.random.uniform([batch_size], 0, self.num_classes, dtype=tf.int32)
        fake_images = self.generator([noise, fake_labels], training=True)
        
        with tf.GradientTape() as tape:
            # Get critic scores for real images
            real_validity, real_label_pred = self.critic([real_images, real_labels], training=True)
            
            # Get critic scores for fake images
            fake_validity, fake_label_pred = self.critic([fake_images, fake_labels], training=True)
            
            # Wasserstein loss: maximize D(real) - D(fake)
            # For minimization: minimize -(D(real) - D(fake)) = minimize (D(fake) - D(real))
            wasserstein_loss = tf.reduce_mean(fake_validity) - tf.reduce_mean(real_validity)
            
            # Gradient penalty
            gp_loss = self.gradient_penalty(real_images, fake_images, real_labels)
            
            # Label classification losses
            real_label_loss = self.categorical_loss(real_labels, real_label_pred)
            fake_label_loss = self.categorical_loss(fake_labels, fake_label_pred)
            label_loss = (real_label_loss + fake_label_loss) / 2
            
            # Total critic loss
            critic_loss = wasserstein_loss + self.gradient_penalty_weight * gp_loss + label_loss
        
        # Update critic
        gradients = tape.gradient(critic_loss, self.critic.trainable_variables)
        self.critic_optimizer.apply_gradients(zip(gradients, self.critic.trainable_variables))
        
        # Update metrics
        self.critic_loss_metric.update_state(critic_loss)
        self.wasserstein_loss_metric.update_state(wasserstein_loss)
        self.gp_loss_metric.update_state(gp_loss)
        self.acc_metric.update_state(real_labels, real_label_pred)
        
        return critic_loss, wasserstein_loss, gp_loss
    
    @tf.function
    def train_generator(self, batch_size):
        """Train generator to fool the critic"""
        noise = tf.random.normal([batch_size, self.latent_dim])
        fake_labels = tf.random.uniform([batch_size], 0, self.num_classes, dtype=tf.int32)
        
        with tf.GradientTape() as tape:
            fake_images = self.generator([noise, fake_labels], training=True)
            fake_validity, fake_label_pred = self.critic([fake_images, fake_labels], training=True)
            
            # Generator wants to maximize critic score for fake images
            # For minimization: minimize -D(fake)
            adversarial_loss = -tf.reduce_mean(fake_validity)
            
            # Generator wants correct label classification
            label_loss = self.categorical_loss(fake_labels, fake_label_pred)
            
            # Total generator loss
            gen_loss = adversarial_loss + label_loss
        
        # Update generator
        gradients = tape.gradient(gen_loss, self.generator.trainable_variables)
        self.gen_optimizer.apply_gradients(zip(gradients, self.generator.trainable_variables))
        
        # Update metrics
        self.gen_loss_metric.update_state(gen_loss)
        
        return gen_loss
    
    def train_epoch(self, dataset, epoch, steps_per_epoch):
        """Train for one epoch with WGAN-GP strategy"""
        print(f"\nEpoch {epoch + 1}")
        print("-" * 60)
        
        # Reset metrics
        self.gen_loss_metric.reset_states()
        self.critic_loss_metric.reset_states()
        self.wasserstein_loss_metric.reset_states()
        self.gp_loss_metric.reset_states()
        self.acc_metric.reset_states()
        
        # Training loop
        for step, (real_images, real_labels) in enumerate(dataset.take(steps_per_epoch)):
            batch_size = tf.shape(real_images)[0]
            
            # Train critic multiple times per generator training
            for _ in range(self.n_critic):
                critic_loss, wasserstein_loss, gp_loss = self.train_critic(real_images, real_labels, batch_size)
            
            # Train generator once
            gen_loss = self.train_generator(batch_size)
            
            # Print progress
            if step % 100 == 0:
                print(f"Step {step:4d}/{steps_per_epoch} - "
                      f"Gen Loss: {self.gen_loss_metric.result():.4f}, "
                      f"Critic Loss: {self.critic_loss_metric.result():.4f}, "
                      f"W-Loss: {self.wasserstein_loss_metric.result():.4f}, "
                      f"GP: {self.gp_loss_metric.result():.4f}, "
                      f"Acc: {self.acc_metric.result():.4f}")
        
        # Store epoch results
        self.history['gen_loss'].append(float(self.gen_loss_metric.result()))
        self.history['critic_loss'].append(float(self.critic_loss_metric.result()))
        self.history['wasserstein_loss'].append(float(self.wasserstein_loss_metric.result()))
        self.history['gp_loss'].append(float(self.gp_loss_metric.result()))
        self.history['label_accuracy'].append(float(self.acc_metric.result()))
        self.history['epoch'].append(epoch)
        

# Baseline WGAN trainer
baseline_wgan_trainer_new = WGANTrainer(
    generator=baseline_wgan_generator,
    critic=baseline_wgan_critic,
    latent_dim=100,
    num_classes=num_classes,
    gradient_penalty_weight=10.0
)

# Enhanced WGAN trainer
enhanced_wgan_trainer_new = WGANTrainer(
    generator=enhanced_wgan_generator,
    critic=enhanced_wgan_critic,
    latent_dim=100,
    num_classes=num_classes,
    gradient_penalty_weight=10.0
)
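As a quick sanity check on the gradient-penalty term, the snippet below reproduces $(\lVert \nabla D(\hat{x}) \rVert_2 - 1)^2$ numerically with NumPy. The linear toy critic and finite-difference gradients are illustrative assumptions (not the TensorFlow models above); for $D(x) = w \cdot x$ the gradient is $w$ everywhere, so the penalty should equal $(\lVert w \rVert - 1)^2$ regardless of the interpolation point.

```python
import numpy as np

def toy_critic(x, w):
    return float(np.dot(w, x))  # linear critic: gradient is w everywhere

def toy_gradient_penalty(real, fake, w, alpha=0.5, eps=1e-6):
    x_hat = alpha * real + (1.0 - alpha) * fake          # interpolate, as in WGANTrainer
    grad = np.array([
        (toy_critic(x_hat + eps * e, w) - toy_critic(x_hat - eps * e, w)) / (2 * eps)
        for e in np.eye(len(x_hat))                      # central finite differences
    ])
    return (np.linalg.norm(grad) - 1.0) ** 2             # (||grad|| - 1)^2

w = np.array([3.0, 4.0])                                 # ||w|| = 5
gp = toy_gradient_penalty(np.zeros(2), np.ones(2), w)
print(round(gp, 4))                                      # ~ (5 - 1)^2 = 16.0
```

This mirrors the interpolation-then-gradient-norm structure of `WGANTrainer.gradient_penalty`, just with automatic differentiation swapped for finite differences.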
In [24]:
# ============================================================================= 
# BASELINE WGAN TRAINING 
# =============================================================================

# Test visualization first
print("Testing visualization with baseline WGAN models...")
test_images_baseline, test_labels_baseline = display_generated_samples_grid(
    baseline_wgan_generator, class_to_letter, samples_per_class=6
)

# Training configuration
NUM_EPOCHS = 50

start_time = time.time()

for epoch in range(NUM_EPOCHS):
    epoch_start = time.time()
    
    # Train for one epoch with WGAN-GP strategy
    baseline_wgan_trainer_new.train_epoch(train_dataset, epoch, steps_per_epoch)
    
    # Display only on the final epoch
    if (epoch + 1) == NUM_EPOCHS:
        print(f"\nGenerating Grid of Samples at Final Epoch {epoch + 1}:")
        display_generated_samples_grid(baseline_wgan_generator, class_to_letter, epoch + 1, samples_per_class=6)
    
    # Track epoch timing (epoch_time/avg_time/eta available for logging)
    epoch_time = time.time() - epoch_start
    total_time = time.time() - start_time
    avg_time = total_time / (epoch + 1)
    eta = avg_time * (NUM_EPOCHS - epoch - 1)

    print("-" * 70)

total_training_time = time.time() - start_time

# Generate final display
print(f"\nFinal Generated Samples - All Letter Classes:")
final_images_baseline, final_labels_baseline = display_generated_samples_grid(
    baseline_wgan_generator, class_to_letter, NUM_EPOCHS, samples_per_class=6
)

# Display comprehensive training summary
available_letters = sorted(class_to_letter.values())
missing_letters = [letter for letter in 'ABCDEFGHIJKLMNOPQRSTUVWXYZ' if letter not in available_letters]

# Plot training progress
if len(baseline_wgan_trainer_new.history['gen_loss']) > 1:
    plt.figure(figsize=(15, 5))
    
    # Generator and Critic Loss
    plt.subplot(1, 3, 1)
    epochs = baseline_wgan_trainer_new.history['epoch']
    plt.plot(epochs, baseline_wgan_trainer_new.history['gen_loss'], label='Generator Loss', color='blue', linewidth=2)
    plt.plot(epochs, baseline_wgan_trainer_new.history['critic_loss'], label='Critic Loss', color='red', linewidth=2)
    plt.title('Baseline WGAN - Training Losses', fontsize=14, fontweight='bold')
    plt.xlabel('Epoch')
    plt.ylabel('Loss')
    plt.legend()
    plt.grid(True, alpha=0.3)
    
    # Critic Accuracy
    plt.subplot(1, 3, 2)
    plt.plot(epochs, baseline_wgan_trainer_new.history['label_accuracy'], label='Critic Accuracy', color='green', linewidth=2)
    plt.axhline(y=0.95, color='red', linestyle='--', alpha=0.7, label='Upper limit (95%)')
    plt.axhline(y=0.70, color='orange', linestyle='--', alpha=0.7, label='Lower limit (70%)')
    plt.title('Baseline WGAN - Critic Accuracy', fontsize=14, fontweight='bold')
    plt.xlabel('Epoch')
    plt.ylabel('Accuracy')
    plt.legend()
    plt.grid(True, alpha=0.3)
    
    # Loss Difference
    plt.subplot(1, 3, 3)
    loss_diff = [g - c for g, c in zip(baseline_wgan_trainer_new.history['gen_loss'], baseline_wgan_trainer_new.history['critic_loss'])]
    plt.plot(epochs, loss_diff, label='Gen Loss - Critic Loss', color='purple', linewidth=2)
    plt.axhline(y=0, color='black', linestyle='-', alpha=0.5)
    plt.title('Baseline WGAN - Loss Balance', fontsize=14, fontweight='bold')
    plt.xlabel('Epoch')
    plt.ylabel('Loss Difference')
    plt.legend()
    plt.grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.show()
Testing visualization with baseline WGAN models...
Generating 6 samples per class for all 16 letter classes...
Epoch 1
------------------------------------------------------------
Step    0/597 - Gen Loss: 8.6118, Critic Loss: -0.0681, W-Loss: -7.1122, GP: 0.4398, Acc: 0.3625
Step  100/597 - Gen Loss: 18.3094, Critic Loss: -131.0980, W-Loss: -216.4599, GP: 8.4363, Acc: 0.8805
Step  200/597 - Gen Loss: -5.7413, Critic Loss: -111.7363, W-Loss: -195.4327, GP: 8.3175, Acc: 0.9400
Step  300/597 - Gen Loss: -23.6389, Critic Loss: -147.2175, W-Loss: -245.1508, GP: 9.7581, Acc: 0.9599
Step  400/597 - Gen Loss: -25.3361, Critic Loss: -288.6538, W-Loss: -442.6199, GP: 15.3698, Acc: 0.9692
Step  500/597 - Gen Loss: -10.2545, Critic Loss: -589.1346, W-Loss: -858.3094, GP: 26.8951, Acc: 0.9723
----------------------------------------------------------------------

Epoch 2
------------------------------------------------------------
Step    0/597 - Gen Loss: 785.8337, Critic Loss: -4799.9863, W-Loss: -5823.1958, GP: 102.3207, Acc: 1.0000
Step  100/597 - Gen Loss: 1369.5128, Critic Loss: -7690.2207, W-Loss: -9629.4365, GP: 193.8833, Acc: 0.9232
Step  200/597 - Gen Loss: 2410.6033, Critic Loss: -11294.0537, W-Loss: -13858.0303, GP: 256.3205, Acc: 0.8892
Step  300/597 - Gen Loss: 4086.2915, Critic Loss: -16035.1670, W-Loss: -19430.1934, GP: 339.3728, Acc: 0.8651
Step  400/597 - Gen Loss: 6596.3428, Critic Loss: -23029.1172, W-Loss: -27382.4023, GP: 435.1331, Acc: 0.8531
Step  500/597 - Gen Loss: 10392.0439, Critic Loss: -32416.7539, W-Loss: -38219.3711, GP: 579.9567, Acc: 0.8136
----------------------------------------------------------------------

Epoch 3
------------------------------------------------------------
Step    0/597 - Gen Loss: 53448.3477, Critic Loss: -143933.1250, W-Loss: -153771.0938, GP: 982.2769, Acc: 0.2875
Step  100/597 - Gen Loss: 64274.5898, Critic Loss: -157272.9844, W-Loss: -177378.6406, GP: 2008.8613, Acc: 0.3061
Step  200/597 - Gen Loss: 80810.6094, Critic Loss: -192774.8906, W-Loss: -215255.9688, GP: 2245.9971, Acc: 0.2831
Step  300/597 - Gen Loss: 100183.0781, Critic Loss: -233923.0781, W-Loss: -260321.1250, GP: 2637.1680, Acc: 0.2577
Step  400/597 - Gen Loss: 115864.8047, Critic Loss: -179966.7031, W-Loss: -204393.4688, GP: 2439.7349, Acc: 0.2631
Step  500/597 - Gen Loss: 122804.7031, Critic Loss: -144031.6250, W-Loss: -163596.2500, GP: 1953.1084, Acc: 0.2638
----------------------------------------------------------------------

Epoch 4
------------------------------------------------------------
Step    0/597 - Gen Loss: 131942.4219, Critic Loss: 182.7888, W-Loss: 147.8906, GP: 0.1280, Acc: 0.3031
Step  100/597 - Gen Loss: 123666.1172, Critic Loss: 33.2206, W-Loss: -3.7786, GP: 0.1388, Acc: 0.2721
Step  200/597 - Gen Loss: 116928.7344, Critic Loss: 24.1903, W-Loss: -7.2244, GP: 0.1358, Acc: 0.3149
Step  300/597 - Gen Loss: 110796.9219, Critic Loss: 19.8649, W-Loss: -6.5400, GP: 0.1364, Acc: 0.4033
Step  400/597 - Gen Loss: 106547.2656, Critic Loss: 14.2029, W-Loss: -7.3097, GP: 0.1323, Acc: 0.5267
Step  500/597 - Gen Loss: 103389.3438, Critic Loss: 12.7054, W-Loss: -6.2687, GP: 0.1276, Acc: 0.5887
----------------------------------------------------------------------

Epoch 5
------------------------------------------------------------
Step    0/597 - Gen Loss: 82824.9219, Critic Loss: -71.1711, W-Loss: -72.0469, GP: 0.0876, Acc: 1.0000
Step  100/597 - Gen Loss: 81744.9062, Critic Loss: -12.6455, W-Loss: -16.7547, GP: 0.0964, Acc: 0.9294
Step  200/597 - Gen Loss: 81019.9766, Critic Loss: -8.1706, W-Loss: -12.7414, GP: 0.0929, Acc: 0.9121
Step  300/597 - Gen Loss: 79900.0547, Critic Loss: -5.4919, W-Loss: -9.8500, GP: 0.0916, Acc: 0.9175
Step  400/597 - Gen Loss: 78710.3203, Critic Loss: -2.9246, W-Loss: -6.9733, GP: 0.0906, Acc: 0.9229
Step  500/597 - Gen Loss: 77471.7500, Critic Loss: -1.0269, W-Loss: -5.0524, GP: 0.0892, Acc: 0.9196
----------------------------------------------------------------------

Epoch 6
------------------------------------------------------------
Step    0/597 - Gen Loss: 68433.4375, Critic Loss: -48.0617, W-Loss: -48.8937, GP: 0.0832, Acc: 1.0000
Step  100/597 - Gen Loss: 66745.4531, Critic Loss: 7.3389, W-Loss: 4.6892, GP: 0.0829, Acc: 0.9484
Step  200/597 - Gen Loss: 65809.2500, Critic Loss: -1.6725, W-Loss: -4.3446, GP: 0.0845, Acc: 0.9480
Step  300/597 - Gen Loss: 64655.0156, Critic Loss: 0.1449, W-Loss: -2.5453, GP: 0.0875, Acc: 0.9443
Step  400/597 - Gen Loss: 63561.3008, Critic Loss: -2.2382, W-Loss: -4.8428, GP: 0.0887, Acc: 0.9466
Step  500/597 - Gen Loss: 62480.7617, Critic Loss: -0.1807, W-Loss: -2.6742, GP: 0.0902, Acc: 0.9497
----------------------------------------------------------------------

Epoch 7
------------------------------------------------------------
Step    0/597 - Gen Loss: 55855.4375, Critic Loss: -224.6918, W-Loss: -225.5312, GP: 0.0839, Acc: 1.0000
Step  100/597 - Gen Loss: 54200.5195, Critic Loss: 3.3992, W-Loss: 1.0956, GP: 0.1026, Acc: 0.9485
Step  200/597 - Gen Loss: 53511.0156, Critic Loss: -1.1777, W-Loss: -3.2416, GP: 0.1070, Acc: 0.9592
Step  300/597 - Gen Loss: 52795.7852, Critic Loss: -4.7068, W-Loss: -6.7331, GP: 0.1120, Acc: 0.9621
Step  400/597 - Gen Loss: 52049.6914, Critic Loss: -4.2902, W-Loss: -6.3235, GP: 0.1158, Acc: 0.9632
Step  500/597 - Gen Loss: 51341.5391, Critic Loss: -5.3060, W-Loss: -7.3263, GP: 0.1184, Acc: 0.9644
----------------------------------------------------------------------

Epoch 8
------------------------------------------------------------
Step    0/597 - Gen Loss: 46082.0938, Critic Loss: -6.5016, W-Loss: -7.5836, GP: 0.1082, Acc: 1.0000
Step  100/597 - Gen Loss: 45453.9258, Critic Loss: -10.0887, W-Loss: -11.9534, GP: 0.1435, Acc: 0.9821
Step  200/597 - Gen Loss: 44821.9062, Critic Loss: -0.3133, W-Loss: -2.1979, GP: 0.1451, Acc: 0.9799
Step  300/597 - Gen Loss: 44156.4961, Critic Loss: -8.4548, W-Loss: -10.4012, GP: 0.1532, Acc: 0.9798
Step  400/597 - Gen Loss: 43582.7266, Critic Loss: -6.8840, W-Loss: -8.8439, GP: 0.1560, Acc: 0.9803
Step  500/597 - Gen Loss: 43056.7266, Critic Loss: -8.9574, W-Loss: -10.9321, GP: 0.1592, Acc: 0.9810
----------------------------------------------------------------------

Epoch 9
------------------------------------------------------------
Step    0/597 - Gen Loss: 39955.4141, Critic Loss: -221.9049, W-Loss: -233.0242, GP: 1.1119, Acc: 1.0000
Step  100/597 - Gen Loss: 39580.4180, Critic Loss: -12.5835, W-Loss: -22.4590, GP: 0.9659, Acc: 0.9878
Step  200/597 - Gen Loss: 39583.8242, Critic Loss: -35.3202, W-Loss: -50.0761, GP: 1.4553, Acc: 0.9877
Step  300/597 - Gen Loss: 39492.7070, Critic Loss: -43.3455, W-Loss: -63.4082, GP: 1.9867, Acc: 0.9873
Step  400/597 - Gen Loss: 39221.8945, Critic Loss: -49.8007, W-Loss: -74.6142, GP: 2.4626, Acc: 0.9878
Step  500/597 - Gen Loss: 38839.8398, Critic Loss: -51.4971, W-Loss: -81.2672, GP: 2.9592, Acc: 0.9878
----------------------------------------------------------------------

Epoch 10
------------------------------------------------------------
Step    0/597 - Gen Loss: 35983.1406, Critic Loss: -208.2044, W-Loss: -270.7008, GP: 6.2496, Acc: 1.0000
Step  100/597 - Gen Loss: 35526.7266, Critic Loss: -124.3812, W-Loss: -221.4867, GP: 9.6946, Acc: 0.9844
Step  200/597 - Gen Loss: 35474.6289, Critic Loss: -163.2366, W-Loss: -277.7753, GP: 11.4418, Acc: 0.9878
Step  300/597 - Gen Loss: 35592.3164, Critic Loss: -241.2654, W-Loss: -384.4381, GP: 14.3044, Acc: 0.9873
Step  400/597 - Gen Loss: 36026.5547, Critic Loss: -351.4949, W-Loss: -540.9835, GP: 18.9355, Acc: 0.9863
Step  500/597 - Gen Loss: 36686.5391, Critic Loss: -576.9902, W-Loss: -833.5413, GP: 25.6399, Acc: 0.9829
----------------------------------------------------------------------

Epoch 11
------------------------------------------------------------
Step    0/597 - Gen Loss: 40466.0078, Critic Loss: -3723.9055, W-Loss: -4217.5986, GP: 49.3637, Acc: 0.9844
Step  100/597 - Gen Loss: 40011.4570, Critic Loss: -2825.7065, W-Loss: -3896.4404, GP: 107.0596, Acc: 0.9802
Step  200/597 - Gen Loss: 41150.6992, Critic Loss: -3373.3899, W-Loss: -4626.2441, GP: 125.2759, Acc: 0.9857
Step  300/597 - Gen Loss: 42461.7930, Critic Loss: -3939.3069, W-Loss: -5420.0234, GP: 148.0629, Acc: 0.9873
Step  400/597 - Gen Loss: 43854.3203, Critic Loss: -4701.5430, W-Loss: -6465.3105, GP: 176.3694, Acc: 0.9891
Step  500/597 - Gen Loss: 44866.2891, Critic Loss: -5150.9688, W-Loss: -7156.8877, GP: 200.5853, Acc: 0.9899
----------------------------------------------------------------------

Epoch 12
------------------------------------------------------------
Step    0/597 - Gen Loss: 54691.8438, Critic Loss: -8829.5020, W-Loss: -12615.6104, GP: 378.6109, Acc: 1.0000
Step  100/597 - Gen Loss: 52560.1836, Critic Loss: -9675.3105, W-Loss: -14514.8125, GP: 483.9417, Acc: 0.9920
Step  200/597 - Gen Loss: 52307.4961, Critic Loss: -10896.4873, W-Loss: -15919.7705, GP: 502.3228, Acc: 0.9923
Step  300/597 - Gen Loss: 53303.7891, Critic Loss: -11906.7588, W-Loss: -17472.1152, GP: 556.5298, Acc: 0.9916
Step  400/597 - Gen Loss: 53943.9219, Critic Loss: -13383.7207, W-Loss: -19338.6133, GP: 595.4824, Acc: 0.9915
Step  500/597 - Gen Loss: 54947.5391, Critic Loss: -14196.6230, W-Loss: -20794.8320, GP: 659.8150, Acc: 0.9916
----------------------------------------------------------------------

Epoch 13
------------------------------------------------------------
Step    0/597 - Gen Loss: 64298.0273, Critic Loss: 169.6051, W-Loss: 61.1773, GP: 10.8428, Acc: 1.0000
Step  100/597 - Gen Loss: 61951.6602, Critic Loss: 65.4824, W-Loss: 41.7421, GP: 2.3638, Acc: 0.9944
Step  200/597 - Gen Loss: 60066.3867, Critic Loss: 28.6504, W-Loss: 15.5367, GP: 1.3062, Acc: 0.9972
Step  300/597 - Gen Loss: 58652.5430, Critic Loss: -6.6721, W-Loss: -15.8388, GP: 0.9132, Acc: 0.9981
Step  400/597 - Gen Loss: 57501.1484, Critic Loss: 3.5672, W-Loss: -3.4640, GP: 0.7006, Acc: 0.9986
Step  500/597 - Gen Loss: 56432.2930, Critic Loss: 0.8516, W-Loss: -4.8481, GP: 0.5679, Acc: 0.9989
----------------------------------------------------------------------

Epoch 14
------------------------------------------------------------
Step    0/597 - Gen Loss: 50530.6797, Critic Loss: 726.9702, W-Loss: 726.6813, GP: 0.0289, Acc: 1.0000
Step  100/597 - Gen Loss: 48884.4375, Critic Loss: 1.5188, W-Loss: 1.0236, GP: 0.0359, Acc: 0.9945
Step  200/597 - Gen Loss: 47393.9688, Critic Loss: 1.5815, W-Loss: 1.1915, GP: 0.0321, Acc: 0.9972
Step  300/597 - Gen Loss: 45935.0195, Critic Loss: -19.5610, W-Loss: -19.9265, GP: 0.0320, Acc: 0.9982
Step  400/597 - Gen Loss: 44486.7891, Critic Loss: -14.6572, W-Loss: -15.0170, GP: 0.0325, Acc: 0.9986
Step  500/597 - Gen Loss: 42967.6484, Critic Loss: -19.8738, W-Loss: -20.2370, GP: 0.0336, Acc: 0.9989
----------------------------------------------------------------------

Epoch 15
------------------------------------------------------------
Step    0/597 - Gen Loss: 32209.4844, Critic Loss: 201.9582, W-Loss: 201.5918, GP: 0.0366, Acc: 1.0000
Step  100/597 - Gen Loss: 31473.4844, Critic Loss: -11.6977, W-Loss: -14.3971, GP: 0.2518, Acc: 0.9929
Step  200/597 - Gen Loss: 31188.2344, Critic Loss: -5.6912, W-Loss: -8.8662, GP: 0.3084, Acc: 0.9965
Step  300/597 - Gen Loss: 30813.0273, Critic Loss: -24.1527, W-Loss: -29.3722, GP: 0.5099, Acc: 0.9949
Step  400/597 - Gen Loss: 30388.5762, Critic Loss: -17.5597, W-Loss: -26.1260, GP: 0.8423, Acc: 0.9938
Step  500/597 - Gen Loss: 29887.2422, Critic Loss: -20.7838, W-Loss: -34.2993, GP: 1.3368, Acc: 0.9933
----------------------------------------------------------------------

Epoch 16
------------------------------------------------------------
Step    0/597 - Gen Loss: 28278.0957, Critic Loss: 191.0257, W-Loss: 174.8680, GP: 1.6158, Acc: 1.0000
Step  100/597 - Gen Loss: 26865.5996, Critic Loss: -28.8465, W-Loss: -56.3607, GP: 2.7287, Acc: 0.9860
Step  200/597 - Gen Loss: 26421.2305, Critic Loss: -28.8572, W-Loss: -56.2524, GP: 2.7208, Acc: 0.9904
Step  300/597 - Gen Loss: 26248.5078, Critic Loss: -81.8056, W-Loss: -110.4927, GP: 2.8463, Acc: 0.9886
Step  400/597 - Gen Loss: 25958.9121, Critic Loss: -58.1378, W-Loss: -89.1676, GP: 3.0823, Acc: 0.9896
Step  500/597 - Gen Loss: 25527.1641, Critic Loss: -57.3645, W-Loss: -90.5654, GP: 3.2998, Acc: 0.9894
----------------------------------------------------------------------

Epoch 17
------------------------------------------------------------
Step    0/597 - Gen Loss: 22468.8379, Critic Loss: -974.5160, W-Loss: -1004.5875, GP: 3.0071, Acc: 1.0000
Step  100/597 - Gen Loss: 21297.4336, Critic Loss: 37.7225, W-Loss: -24.7019, GP: 6.2282, Acc: 0.9901
Step  200/597 - Gen Loss: 20398.5938, Critic Loss: -30.7132, W-Loss: -94.3902, GP: 6.3506, Acc: 0.9894
Step  300/597 - Gen Loss: 20012.5527, Critic Loss: -86.0829, W-Loss: -165.0127, GP: 7.8785, Acc: 0.9912
Step  400/597 - Gen Loss: 19383.4941, Critic Loss: -109.6275, W-Loss: -224.2629, GP: 11.4501, Acc: 0.9908
Step  500/597 - Gen Loss: 18766.6855, Critic Loss: -190.6171, W-Loss: -373.5858, GP: 18.2835, Acc: 0.9906
----------------------------------------------------------------------

Epoch 18
------------------------------------------------------------
Step    0/597 - Gen Loss: 12775.6738, Critic Loss: 218.7244, W-Loss: -2532.2234, GP: 275.0090, Acc: 0.9375
Step  100/597 - Gen Loss: 15653.8428, Critic Loss: -4771.5093, W-Loss: -5501.5503, GP: 72.9850, Acc: 0.9836
Step  200/597 - Gen Loss: 13932.1719, Critic Loss: -3249.5837, W-Loss: -3947.1013, GP: 69.7375, Acc: 0.9869
Step  300/597 - Gen Loss: 13386.9785, Critic Loss: -2572.1455, W-Loss: -3314.0012, GP: 74.1709, Acc: 0.9868
Step  400/597 - Gen Loss: 12961.7354, Critic Loss: -2203.0273, W-Loss: -2928.3164, GP: 72.5125, Acc: 0.9850
Step  500/597 - Gen Loss: 12599.8535, Critic Loss: -2160.9080, W-Loss: -2892.1387, GP: 73.1053, Acc: 0.9837
----------------------------------------------------------------------

Epoch 19
------------------------------------------------------------
Step    0/597 - Gen Loss: 10893.4043, Critic Loss: -1236.2723, W-Loss: -2041.9996, GP: 80.5727, Acc: 1.0000
Step  100/597 - Gen Loss: 8389.4385, Critic Loss: -2192.0007, W-Loss: -3095.6504, GP: 90.3539, Acc: 0.9869
Step  200/597 - Gen Loss: 8790.1279, Critic Loss: -2052.4526, W-Loss: -2840.9578, GP: 78.8397, Acc: 0.9864
Step  300/597 - Gen Loss: 9053.3643, Critic Loss: -2315.1902, W-Loss: -3195.2346, GP: 87.9935, Acc: 0.9865
Step  400/597 - Gen Loss: 9088.3594, Critic Loss: -2758.5647, W-Loss: -3653.7424, GP: 89.5072, Acc: 0.9861
Step  500/597 - Gen Loss: 9188.3438, Critic Loss: -2654.6758, W-Loss: -3598.1545, GP: 94.3380, Acc: 0.9871
----------------------------------------------------------------------

Epoch 20
------------------------------------------------------------
Step    0/597 - Gen Loss: 5328.5742, Critic Loss: -2334.5723, W-Loss: -2575.9062, GP: 24.1334, Acc: 1.0000
Step  100/597 - Gen Loss: 6199.2500, Critic Loss: -293.4164, W-Loss: -816.3134, GP: 52.2832, Acc: 0.9911
Step  200/597 - Gen Loss: 5148.7534, Critic Loss: -269.5768, W-Loss: -786.8328, GP: 51.7190, Acc: 0.9914
Step  300/597 - Gen Loss: 4591.6802, Critic Loss: -304.8382, W-Loss: -848.0193, GP: 54.3111, Acc: 0.9918
Step  400/597 - Gen Loss: 3988.5623, Critic Loss: -307.8248, W-Loss: -902.1447, GP: 59.4255, Acc: 0.9922
Step  500/597 - Gen Loss: 3752.5276, Critic Loss: -385.8986, W-Loss: -1011.6339, GP: 62.5672, Acc: 0.9922
----------------------------------------------------------------------

Epoch 21
------------------------------------------------------------
Step    0/597 - Gen Loss: 1439.5823, Critic Loss: -1402.5095, W-Loss: -2405.4614, GP: 100.2501, Acc: 0.9094
Step  100/597 - Gen Loss: -722.8683, Critic Loss: -925.8400, W-Loss: -2138.1533, GP: 121.2275, Acc: 0.9933
Step  200/597 - Gen Loss: -911.6037, Critic Loss: -987.7186, W-Loss: -2128.6487, GP: 114.0891, Acc: 0.9939
Step  300/597 - Gen Loss: -1446.6743, Critic Loss: -1164.7976, W-Loss: -2256.2881, GP: 109.1444, Acc: 0.9933
Step  400/597 - Gen Loss: -1763.0546, Critic Loss: -1420.9772, W-Loss: -2533.4402, GP: 111.2415, Acc: 0.9936
Step  500/597 - Gen Loss: -1744.3406, Critic Loss: -1673.1755, W-Loss: -2826.5208, GP: 115.3303, Acc: 0.9944
----------------------------------------------------------------------

Epoch 22
------------------------------------------------------------
Step    0/597 - Gen Loss: 1823.4227, Critic Loss: -5201.3643, W-Loss: -6009.9463, GP: 80.8582, Acc: 1.0000
Step  100/597 - Gen Loss: -1987.2332, Critic Loss: -4373.2886, W-Loss: -6122.8281, GP: 174.9476, Acc: 0.9929
Step  200/597 - Gen Loss: -2070.6851, Critic Loss: -4621.4448, W-Loss: -6519.8525, GP: 189.8357, Acc: 0.9930
Step  300/597 - Gen Loss: -1436.7054, Critic Loss: -4857.4111, W-Loss: -6898.3184, GP: 204.0865, Acc: 0.9933
Step  400/597 - Gen Loss: -685.1618, Critic Loss: -5368.0068, W-Loss: -7570.5142, GP: 220.2467, Acc: 0.9935
Step  500/597 - Gen Loss: -16.6916, Critic Loss: -5378.2769, W-Loss: -7705.0688, GP: 232.6748, Acc: 0.9939
----------------------------------------------------------------------

Epoch 23
------------------------------------------------------------
Step    0/597 - Gen Loss: 4238.7344, Critic Loss: -12920.6396, W-Loss: -15076.1768, GP: 215.5537, Acc: 1.0000
Step  100/597 - Gen Loss: 4267.5850, Critic Loss: -10146.2529, W-Loss: -14186.6455, GP: 404.0372, Acc: 0.9970
Step  200/597 - Gen Loss: 5342.1167, Critic Loss: -10810.9219, W-Loss: -15075.5078, GP: 426.4557, Acc: 0.9950
Step  300/597 - Gen Loss: 5844.2065, Critic Loss: -10984.9346, W-Loss: -15607.9766, GP: 462.3016, Acc: 0.9953
Step  400/597 - Gen Loss: 7355.7334, Critic Loss: -12110.5371, W-Loss: -16949.5762, GP: 483.9005, Acc: 0.9951
Step  500/597 - Gen Loss: 8642.3789, Critic Loss: -13247.9766, W-Loss: -18416.4023, GP: 516.8376, Acc: 0.9949
----------------------------------------------------------------------

Epoch 24
------------------------------------------------------------
Step    0/597 - Gen Loss: 25681.4609, Critic Loss: -17869.1953, W-Loss: -22885.0273, GP: 501.5831, Acc: 1.0000
Step  100/597 - Gen Loss: 17861.1836, Critic Loss: -20373.1055, W-Loss: -28777.7090, GP: 840.4575, Acc: 0.9933
Step  200/597 - Gen Loss: 18179.0527, Critic Loss: -21859.8340, W-Loss: -30401.9922, GP: 854.2130, Acc: 0.9937
Step  300/597 - Gen Loss: 19209.4277, Critic Loss: -22978.5938, W-Loss: -32157.2598, GP: 917.8622, Acc: 0.9937
Step  400/597 - Gen Loss: 20591.2129, Critic Loss: -23698.5547, W-Loss: -33263.5273, GP: 956.4903, Acc: 0.9933
Step  500/597 - Gen Loss: 22367.2148, Critic Loss: -24556.3730, W-Loss: -34453.6016, GP: 989.7211, Acc: 0.9935
----------------------------------------------------------------------

Epoch 25
------------------------------------------------------------
Step    0/597 - Gen Loss: 38544.0977, Critic Loss: -24859.8203, W-Loss: -33337.7188, GP: 847.7897, Acc: 1.0000
Step  100/597 - Gen Loss: 37097.3359, Critic Loss: -30862.3203, W-Loss: -45747.5742, GP: 1488.5193, Acc: 0.9920
Step  200/597 - Gen Loss: 36067.7266, Critic Loss: -31835.2051, W-Loss: -47893.0742, GP: 1605.7770, Acc: 0.9913
Step  300/597 - Gen Loss: 38435.6094, Critic Loss: -33940.5312, W-Loss: -50297.2031, GP: 1635.6628, Acc: 0.9918
Step  400/597 - Gen Loss: 39918.9727, Critic Loss: -35243.4023, W-Loss: -51805.7695, GP: 1656.2356, Acc: 0.9925
Step  500/597 - Gen Loss: 41167.9219, Critic Loss: -36024.9414, W-Loss: -52895.0469, GP: 1687.0090, Acc: 0.9921
----------------------------------------------------------------------

Epoch 26
------------------------------------------------------------
Step    0/597 - Gen Loss: 57489.2852, Critic Loss: -72320.6250, W-Loss: -90305.8984, GP: 1798.5277, Acc: 1.0000
Step  100/597 - Gen Loss: 62676.2773, Critic Loss: -1255.0797, W-Loss: -4891.7769, GP: 363.6695, Acc: 1.0000
Step  200/597 - Gen Loss: 60726.1641, Critic Loss: -558.1940, W-Loss: -2394.0767, GP: 183.5880, Acc: 1.0000
Step  300/597 - Gen Loss: 59460.6172, Critic Loss: -463.0557, W-Loss: -1689.8739, GP: 122.6815, Acc: 1.0000
Step  400/597 - Gen Loss: 58524.0117, Critic Loss: -311.6929, W-Loss: -1232.8915, GP: 92.1196, Acc: 1.0000
Step  500/597 - Gen Loss: 57921.3398, Critic Loss: -302.5737, W-Loss: -1040.0468, GP: 73.7471, Acc: 1.0000
----------------------------------------------------------------------

Epoch 27
------------------------------------------------------------
Step    0/597 - Gen Loss: 59651.8555, Critic Loss: 561.2914, W-Loss: 560.5961, GP: 0.0695, Acc: 1.0000
Step  100/597 - Gen Loss: 54376.5195, Critic Loss: 157.3714, W-Loss: 156.8650, GP: 0.0506, Acc: 1.0000
Step  200/597 - Gen Loss: 53593.8477, Critic Loss: 148.1006, W-Loss: 147.6147, GP: 0.0486, Acc: 1.0000
Step  300/597 - Gen Loss: 53376.0625, Critic Loss: -6.7116, W-Loss: -7.1981, GP: 0.0486, Acc: 1.0000
Step  400/597 - Gen Loss: 52497.7148, Critic Loss: -21.2191, W-Loss: -21.7121, GP: 0.0493, Acc: 1.0000
Step  500/597 - Gen Loss: 51818.5898, Critic Loss: -101.7914, W-Loss: -102.2968, GP: 0.0505, Acc: 1.0000
----------------------------------------------------------------------

Epoch 28
------------------------------------------------------------
Step    0/597 - Gen Loss: 47610.8516, Critic Loss: -5891.5884, W-Loss: -5893.7969, GP: 0.2209, Acc: 1.0000
Step  100/597 - Gen Loss: 46027.9688, Critic Loss: 126.4455, W-Loss: 118.5152, GP: 0.7931, Acc: 1.0000
Step  200/597 - Gen Loss: 45523.9219, Critic Loss: -8.2591, W-Loss: -18.7424, GP: 1.0483, Acc: 1.0000
Step  300/597 - Gen Loss: 44623.9805, Critic Loss: -206.9651, W-Loss: -218.8379, GP: 1.1873, Acc: 1.0000
Step  400/597 - Gen Loss: 43987.5469, Critic Loss: -170.4484, W-Loss: -182.7485, GP: 1.2300, Acc: 1.0000
Step  500/597 - Gen Loss: 43607.7969, Critic Loss: -185.6228, W-Loss: -198.4662, GP: 1.2843, Acc: 1.0000
----------------------------------------------------------------------

Epoch 29
------------------------------------------------------------
Step    0/597 - Gen Loss: 40308.5312, Critic Loss: 685.1334, W-Loss: 659.6180, GP: 2.5515, Acc: 1.0000
Step  100/597 - Gen Loss: 39903.6328, Critic Loss: -306.0666, W-Loss: -321.8192, GP: 1.5753, Acc: 1.0000
Step  200/597 - Gen Loss: 38819.3320, Critic Loss: -183.3323, W-Loss: -198.0027, GP: 1.4671, Acc: 1.0000
Step  300/597 - Gen Loss: 37637.7500, Critic Loss: -357.0096, W-Loss: -371.3998, GP: 1.4390, Acc: 1.0000
Step  400/597 - Gen Loss: 36709.6250, Critic Loss: -204.5079, W-Loss: -219.2679, GP: 1.4760, Acc: 1.0000
Step  500/597 - Gen Loss: 35890.8242, Critic Loss: -283.3685, W-Loss: -298.3405, GP: 1.4972, Acc: 1.0000
----------------------------------------------------------------------

Epoch 30
------------------------------------------------------------
Step    0/597 - Gen Loss: 33856.6484, Critic Loss: 1762.5551, W-Loss: 1754.3011, GP: 0.8254, Acc: 1.0000
Step  100/597 - Gen Loss: 31335.6660, Critic Loss: 143.4762, W-Loss: 127.0455, GP: 1.6431, Acc: 1.0000
Step  200/597 - Gen Loss: 30253.5176, Critic Loss: 152.3405, W-Loss: 135.6001, GP: 1.6740, Acc: 1.0000
Step  300/597 - Gen Loss: 29790.4746, Critic Loss: -86.7352, W-Loss: -103.9012, GP: 1.7166, Acc: 1.0000
Step  400/597 - Gen Loss: 29367.7598, Critic Loss: 44.7897, W-Loss: 26.3045, GP: 1.8485, Acc: 1.0000
Step  500/597 - Gen Loss: 28728.3613, Critic Loss: -86.9621, W-Loss: -110.1171, GP: 2.3155, Acc: 1.0000
----------------------------------------------------------------------

Epoch 31
------------------------------------------------------------
Step    0/597 - Gen Loss: 26127.7070, Critic Loss: -5950.2651, W-Loss: -6005.5122, GP: 5.5247, Acc: 1.0000
Step  100/597 - Gen Loss: 27159.8691, Critic Loss: 323.3791, W-Loss: 296.5315, GP: 2.6847, Acc: 1.0000
Step  200/597 - Gen Loss: 27618.0156, Critic Loss: 190.8043, W-Loss: 162.9373, GP: 2.7867, Acc: 1.0000
Step  300/597 - Gen Loss: 27527.9766, Critic Loss: -164.2307, W-Loss: -194.2526, GP: 3.0022, Acc: 1.0000
Step  400/597 - Gen Loss: 27699.1094, Critic Loss: -148.7848, W-Loss: -182.4206, GP: 3.3636, Acc: 1.0000
Step  500/597 - Gen Loss: 27919.8184, Critic Loss: -135.9250, W-Loss: -174.2653, GP: 3.8340, Acc: 1.0000
----------------------------------------------------------------------

Epoch 32
------------------------------------------------------------
Step    0/597 - Gen Loss: 24876.6660, Critic Loss: 2397.8525, W-Loss: 2366.2551, GP: 3.1598, Acc: 1.0000
Step  100/597 - Gen Loss: 29160.4082, Critic Loss: 24.1367, W-Loss: -63.1435, GP: 8.7280, Acc: 1.0000
Step  200/597 - Gen Loss: 28443.8555, Critic Loss: 7.1769, W-Loss: -97.2432, GP: 10.4420, Acc: 1.0000
Step  300/597 - Gen Loss: 27937.0273, Critic Loss: -274.3630, W-Loss: -396.2932, GP: 12.1930, Acc: 1.0000
Step  400/597 - Gen Loss: 27293.8770, Critic Loss: -237.1922, W-Loss: -392.4036, GP: 15.5211, Acc: 1.0000
Step  500/597 - Gen Loss: 26500.9453, Critic Loss: -318.6702, W-Loss: -503.9710, GP: 18.5301, Acc: 1.0000
----------------------------------------------------------------------

Epoch 33
------------------------------------------------------------
Step    0/597 - Gen Loss: 21693.7109, Critic Loss: 1206.3599, W-Loss: 913.6668, GP: 29.2693, Acc: 1.0000
Step  100/597 - Gen Loss: 24410.6465, Critic Loss: -1565.3433, W-Loss: -2054.5747, GP: 48.9232, Acc: 1.0000
Step  200/597 - Gen Loss: 24390.7715, Critic Loss: -1206.4850, W-Loss: -1790.1974, GP: 58.3712, Acc: 1.0000
Step  300/597 - Gen Loss: 24225.2793, Critic Loss: -1808.0894, W-Loss: -2474.1719, GP: 66.6083, Acc: 1.0000
Step  400/597 - Gen Loss: 24412.5312, Critic Loss: -2001.7546, W-Loss: -2762.7766, GP: 76.1022, Acc: 1.0000
Step  500/597 - Gen Loss: 24558.8711, Critic Loss: -2381.6116, W-Loss: -3253.9851, GP: 87.2373, Acc: 0.9997
----------------------------------------------------------------------

Epoch 34
------------------------------------------------------------
Step    0/597 - Gen Loss: 36473.6875, Critic Loss: -17427.4180, W-Loss: -19515.4648, GP: 208.8044, Acc: 1.0000
Step  100/597 - Gen Loss: 28056.5352, Critic Loss: -5476.2935, W-Loss: -7471.8550, GP: 199.5555, Acc: 0.9979
Step  200/597 - Gen Loss: 27658.0449, Critic Loss: -5705.5215, W-Loss: -7883.3579, GP: 217.7822, Acc: 0.9970
Step  300/597 - Gen Loss: 27410.7441, Critic Loss: -6782.8496, W-Loss: -9253.1709, GP: 247.0307, Acc: 0.9971
Step  400/597 - Gen Loss: 27801.1914, Critic Loss: -7681.1489, W-Loss: -10427.3389, GP: 274.6183, Acc: 0.9973
Step  500/597 - Gen Loss: 28421.2578, Critic Loss: -8768.3037, W-Loss: -11817.9893, GP: 304.9685, Acc: 0.9972
----------------------------------------------------------------------

Epoch 35
------------------------------------------------------------
Step    0/597 - Gen Loss: 38431.9102, Critic Loss: -12722.0225, W-Loss: -16359.8779, GP: 363.7855, Acc: 1.0000
Step  100/597 - Gen Loss: 36308.4648, Critic Loss: -16176.4248, W-Loss: -22265.4082, GP: 608.8958, Acc: 0.9953
Step  200/597 - Gen Loss: 36513.7070, Critic Loss: -17433.9902, W-Loss: -23877.7324, GP: 644.3700, Acc: 0.9940
Step  300/597 - Gen Loss: 38000.5625, Critic Loss: -19402.6309, W-Loss: -26355.3145, GP: 695.2634, Acc: 0.9942
Step  400/597 - Gen Loss: 39684.1016, Critic Loss: -20586.8730, W-Loss: -28122.0645, GP: 753.5169, Acc: 0.9940
Step  500/597 - Gen Loss: 41191.8164, Critic Loss: -22222.6562, W-Loss: -30260.1777, GP: 803.7516, Acc: 0.9937
----------------------------------------------------------------------

Epoch 36
------------------------------------------------------------
Step    0/597 - Gen Loss: 59596.5273, Critic Loss: -50369.6992, W-Loss: -56933.7891, GP: 656.4088, Acc: 1.0000
Step  100/597 - Gen Loss: 57983.4219, Critic Loss: -35269.5156, W-Loss: -48090.9453, GP: 1282.1381, Acc: 0.9930
Step  200/597 - Gen Loss: 56740.6875, Critic Loss: -36523.5195, W-Loss: -49765.2891, GP: 1324.1738, Acc: 0.9922
Step  300/597 - Gen Loss: 56805.8281, Critic Loss: -38436.0352, W-Loss: -52526.6172, GP: 1409.0516, Acc: 0.9911
Step  400/597 - Gen Loss: 57662.1133, Critic Loss: -39191.2422, W-Loss: -54169.7852, GP: 1497.8552, Acc: 0.9916
Step  500/597 - Gen Loss: 57931.6094, Critic Loss: -40688.9883, W-Loss: -56495.5352, GP: 1580.6555, Acc: 0.9914
----------------------------------------------------------------------

Epoch 37
------------------------------------------------------------
Step    0/597 - Gen Loss: 76846.4062, Critic Loss: -71816.3594, W-Loss: -99746.4688, GP: 2793.0090, Acc: 0.9875
Step  100/597 - Gen Loss: 65899.3359, Critic Loss: -57365.7656, W-Loss: -78931.8828, GP: 2156.6045, Acc: 0.9912
Step  200/597 - Gen Loss: 65778.6406, Critic Loss: -57096.0195, W-Loss: -80175.5781, GP: 2307.9446, Acc: 0.9875
Step  300/597 - Gen Loss: 66998.5078, Critic Loss: -59456.3281, W-Loss: -83368.3906, GP: 2391.1902, Acc: 0.9875
Step  400/597 - Gen Loss: 68231.8125, Critic Loss: -61567.6328, W-Loss: -86501.4453, GP: 2493.3669, Acc: 0.9876
Step  500/597 - Gen Loss: 69948.2266, Critic Loss: -63742.2070, W-Loss: -89553.1016, GP: 2581.0754, Acc: 0.9872
----------------------------------------------------------------------

Epoch 38
------------------------------------------------------------
Step    0/597 - Gen Loss: 88108.7500, Critic Loss: -56261.9141, W-Loss: -99611.2266, GP: 4334.9307, Acc: 1.0000
Step  100/597 - Gen Loss: 87709.2188, Critic Loss: -81443.7656, W-Loss: -114317.6406, GP: 3287.3647, Acc: 0.9820
Step  200/597 - Gen Loss: 89172.6094, Critic Loss: -78768.1328, W-Loss: -112814.4141, GP: 3404.6072, Acc: 0.9815
Step  300/597 - Gen Loss: 90251.1641, Critic Loss: -81779.5469, W-Loss: -117048.8750, GP: 3526.9097, Acc: 0.9825
Step  400/597 - Gen Loss: 92664.5469, Critic Loss: -84686.3984, W-Loss: -121328.4531, GP: 3664.1907, Acc: 0.9825
Step  500/597 - Gen Loss: 94354.0312, Critic Loss: -85034.3516, W-Loss: -124298.4844, GP: 3926.3948, Acc: 0.9813
----------------------------------------------------------------------

Epoch 39
------------------------------------------------------------
Step    0/597 - Gen Loss: 128216.2344, Critic Loss: -114830.1875, W-Loss: -163178.0312, GP: 4834.7837, Acc: 1.0000
Step  100/597 - Gen Loss: 108501.2500, Critic Loss: -104055.2734, W-Loss: -152033.8438, GP: 4797.8398, Acc: 0.9764
Step  200/597 - Gen Loss: 109451.2500, Critic Loss: -107102.5547, W-Loss: -153764.8906, GP: 4666.2354, Acc: 0.9745
Step  300/597 - Gen Loss: 109346.3594, Critic Loss: -108112.8516, W-Loss: -156077.2656, GP: 4796.4434, Acc: 0.9746
Step  400/597 - Gen Loss: 113058.5859, Critic Loss: -110806.8906, W-Loss: -160282.4062, GP: 4947.5317, Acc: 0.9745
Step  500/597 - Gen Loss: 114125.3750, Critic Loss: -113617.8984, W-Loss: -165088.9531, GP: 5147.0830, Acc: 0.9745
----------------------------------------------------------------------

Epoch 40
------------------------------------------------------------
Step    0/597 - Gen Loss: 91336.1719, Critic Loss: -144873.3438, W-Loss: -170070.8750, GP: 2519.7515, Acc: 0.9969
Step  100/597 - Gen Loss: 135386.7188, Critic Loss: -135688.9375, W-Loss: -193215.5938, GP: 5752.6387, Acc: 0.9732
Step  200/597 - Gen Loss: 133538.2812, Critic Loss: -132227.4375, W-Loss: -191792.5469, GP: 5956.4858, Acc: 0.9723
Step  300/597 - Gen Loss: 131794.4375, Critic Loss: -135131.9531, W-Loss: -196181.1562, GP: 6104.8936, Acc: 0.9693
Step  400/597 - Gen Loss: 133300.3125, Critic Loss: -139235.1406, W-Loss: -201956.9062, GP: 6272.1543, Acc: 0.9677
Step  500/597 - Gen Loss: 136215.8438, Critic Loss: -141393.1406, W-Loss: -204851.4688, GP: 6345.8101, Acc: 0.9685
----------------------------------------------------------------------

Epoch 41
------------------------------------------------------------
Step    0/597 - Gen Loss: 177035.2812, Critic Loss: -186570.0938, W-Loss: -234745.0000, GP: 4817.4878, Acc: 0.9969
Step  100/597 - Gen Loss: 153041.7188, Critic Loss: -174835.1719, W-Loss: -246481.2031, GP: 7164.5713, Acc: 0.9620
Step  200/597 - Gen Loss: 152182.9375, Critic Loss: -169685.5000, W-Loss: -243068.1094, GP: 7338.2251, Acc: 0.9606
Step  300/597 - Gen Loss: 155302.0469, Critic Loss: -171750.0156, W-Loss: -248068.3125, GP: 7631.7754, Acc: 0.9595
Step  400/597 - Gen Loss: 158108.0781, Critic Loss: -175000.5625, W-Loss: -252996.8750, GP: 7799.5723, Acc: 0.9601
Step  500/597 - Gen Loss: 159627.2188, Critic Loss: -179533.0000, W-Loss: -258743.3906, GP: 7921.0117, Acc: 0.9593
----------------------------------------------------------------------

Epoch 42
------------------------------------------------------------
Step    0/597 - Gen Loss: 204975.1562, Critic Loss: -243908.4531, W-Loss: -338831.0938, GP: 9492.2646, Acc: 1.0000
Step  100/597 - Gen Loss: 169561.0625, Critic Loss: -205284.5156, W-Loss: -301748.2500, GP: 9646.3193, Acc: 0.9523
Step  200/597 - Gen Loss: 180741.0781, Critic Loss: -195415.3594, W-Loss: -298775.5312, GP: 10335.9570, Acc: 0.9491
Step  300/597 - Gen Loss: 174885.2500, Critic Loss: -196649.2656, W-Loss: -299352.2812, GP: 10270.2344, Acc: 0.9507
Step  400/597 - Gen Loss: 183406.1406, Critic Loss: -192257.0781, W-Loss: -293000.7812, GP: 10074.3076, Acc: 0.9500
Step  500/597 - Gen Loss: 184230.5156, Critic Loss: -198819.1406, W-Loss: -299818.9062, GP: 10099.8848, Acc: 0.9511
----------------------------------------------------------------------

Epoch 43
------------------------------------------------------------
Step    0/597 - Gen Loss: 258228.2656, Critic Loss: -123265.3594, W-Loss: -191752.9531, GP: 6848.7578, Acc: 1.0000
Step  100/597 - Gen Loss: 218830.7188, Critic Loss: -235074.0156, W-Loss: -342120.0312, GP: 10704.5205, Acc: 0.9384
Step  200/597 - Gen Loss: 218636.7031, Critic Loss: -224810.2344, W-Loss: -333354.0000, GP: 10854.2969, Acc: 0.9364
Step  300/597 - Gen Loss: 216499.0625, Critic Loss: -233624.1719, W-Loss: -344717.6250, GP: 11109.2363, Acc: 0.9365
Step  400/597 - Gen Loss: 216744.7812, Critic Loss: -240096.6406, W-Loss: -354028.4062, GP: 11393.0908, Acc: 0.9390
Step  500/597 - Gen Loss: 217046.2188, Critic Loss: -247114.1250, W-Loss: -361560.2500, GP: 11444.5342, Acc: 0.9399
----------------------------------------------------------------------

Epoch 44
------------------------------------------------------------
Step    0/597 - Gen Loss: 318626.5625, Critic Loss: -326193.7500, W-Loss: -428349.6875, GP: 10215.5986, Acc: 1.0000
Step  100/597 - Gen Loss: 235246.2500, Critic Loss: -273175.7188, W-Loss: -400616.8750, GP: 12743.9893, Acc: 0.9237
Step  200/597 - Gen Loss: 246604.5000, Critic Loss: -277244.9688, W-Loss: -405858.6562, GP: 12861.2461, Acc: 0.9221
Step  300/597 - Gen Loss: 249228.3594, Critic Loss: -281038.9375, W-Loss: -411449.1562, GP: 13040.8613, Acc: 0.9254
Step  400/597 - Gen Loss: 253020.2969, Critic Loss: -288480.4375, W-Loss: -420768.0312, GP: 13228.5615, Acc: 0.9239
Step  500/597 - Gen Loss: 256735.2344, Critic Loss: -284960.1562, W-Loss: -418468.8750, GP: 13350.7129, Acc: 0.9248
----------------------------------------------------------------------

Epoch 45
------------------------------------------------------------
Step    0/597 - Gen Loss: 334633.6875, Critic Loss: -365433.6562, W-Loss: -483013.0938, GP: 11757.9473, Acc: 1.0000
Step  100/597 - Gen Loss: 278770.3438, Critic Loss: -337903.5312, W-Loss: -483829.8438, GP: 14592.4287, Acc: 0.9099
Step  200/597 - Gen Loss: 275008.7500, Critic Loss: -311325.7188, W-Loss: -465231.9062, GP: 15390.4824, Acc: 0.9106
Step  300/597 - Gen Loss: 278685.0625, Critic Loss: -319268.3438, W-Loss: -474086.7500, GP: 15481.6729, Acc: 0.9143
Step  400/597 - Gen Loss: 285197.5312, Critic Loss: -324913.0625, W-Loss: -482599.7188, GP: 15768.4980, Acc: 0.9149
Step  500/597 - Gen Loss: 295563.3750, Critic Loss: -320383.1562, W-Loss: -479232.0625, GP: 15884.8027, Acc: 0.9159
----------------------------------------------------------------------

Epoch 46
------------------------------------------------------------
Step    0/597 - Gen Loss: 397643.0000, Critic Loss: -201566.0469, W-Loss: -445644.0938, GP: 24407.8008, Acc: 1.0000
Step  100/597 - Gen Loss: 322040.2500, Critic Loss: -366610.9375, W-Loss: -531245.3125, GP: 16463.2363, Acc: 0.9209
Step  200/597 - Gen Loss: 325780.5000, Critic Loss: -362135.1875, W-Loss: -527551.0625, GP: 16541.4609, Acc: 0.9289
Step  300/597 - Gen Loss: 322088.2500, Critic Loss: -360161.2188, W-Loss: -532142.5000, GP: 17197.9980, Acc: 0.9241
Step  400/597 - Gen Loss: 325232.0000, Critic Loss: -341496.0625, W-Loss: -516075.0938, GP: 17457.7891, Acc: 0.9153
Step  500/597 - Gen Loss: 328065.2188, Critic Loss: -344858.5000, W-Loss: -520722.2500, GP: 17586.3535, Acc: 0.9145
----------------------------------------------------------------------

Epoch 47
------------------------------------------------------------
Step    0/597 - Gen Loss: 463389.9688, Critic Loss: -485138.0625, W-Loss: -690172.5625, GP: 20503.4492, Acc: 0.9969
Step  100/597 - Gen Loss: 364716.8438, Critic Loss: -392012.8438, W-Loss: -584716.5000, GP: 19270.1602, Acc: 0.9014
Step  200/597 - Gen Loss: 370739.8750, Critic Loss: -353929.0938, W-Loss: -548226.5625, GP: 19429.5469, Acc: 0.9044
Step  300/597 - Gen Loss: 372818.2188, Critic Loss: -344278.6562, W-Loss: -533203.5000, GP: 18892.2422, Acc: 0.9010
Step  400/597 - Gen Loss: 372757.2812, Critic Loss: -350329.3438, W-Loss: -539887.5625, GP: 18955.5703, Acc: 0.9003
Step  500/597 - Gen Loss: 372480.3438, Critic Loss: -366465.1875, W-Loss: -557466.3125, GP: 19099.8164, Acc: 0.8990
----------------------------------------------------------------------

Epoch 48
------------------------------------------------------------
Step    0/597 - Gen Loss: 506033.6875, Critic Loss: -478419.0000, W-Loss: -681212.4375, GP: 20279.3418, Acc: 1.0000
Step  100/597 - Gen Loss: 396205.6250, Critic Loss: -457952.5625, W-Loss: -657486.7500, GP: 19953.1855, Acc: 0.8987
Step  200/597 - Gen Loss: 394974.2500, Critic Loss: -450017.1562, W-Loss: -655358.2500, GP: 20533.8047, Acc: 0.8957
Step  300/597 - Gen Loss: 389430.6562, Critic Loss: -439038.4688, W-Loss: -649781.6875, GP: 21074.0605, Acc: 0.8904
Step  400/597 - Gen Loss: 391188.9062, Critic Loss: -446317.0312, W-Loss: -659210.9375, GP: 21289.1445, Acc: 0.8937
Step  500/597 - Gen Loss: 394686.1250, Critic Loss: -447099.4062, W-Loss: -664240.8750, GP: 21713.8711, Acc: 0.8938
----------------------------------------------------------------------

Epoch 49
------------------------------------------------------------
Step    0/597 - Gen Loss: 439782.6562, Critic Loss: -784534.9375, W-Loss: -889073.5000, GP: 10453.8564, Acc: 1.0000
Step  100/597 - Gen Loss: 404360.7812, Critic Loss: -484866.6250, W-Loss: -714204.1875, GP: 22933.4355, Acc: 0.8826
Step  200/597 - Gen Loss: 417858.7500, Critic Loss: -475573.9688, W-Loss: -709044.5625, GP: 23346.7363, Acc: 0.8843
Step  300/597 - Gen Loss: 426702.8438, Critic Loss: -483148.3750, W-Loss: -721536.0625, GP: 23838.2930, Acc: 0.8815
Step  400/597 - Gen Loss: 430043.1875, Critic Loss: -490235.1875, W-Loss: -730323.5625, GP: 24008.3867, Acc: 0.8821
Step  500/597 - Gen Loss: 436194.1250, Critic Loss: -466769.0000, W-Loss: -710459.3125, GP: 24368.5645, Acc: 0.8759
----------------------------------------------------------------------

Epoch 50
------------------------------------------------------------
Step    0/597 - Gen Loss: 343802.8125, Critic Loss: -652384.7500, W-Loss: -879279.6250, GP: 22689.4805, Acc: 1.0000
Step  100/597 - Gen Loss: 446474.4688, Critic Loss: -555206.5000, W-Loss: -793300.2500, GP: 23809.0137, Acc: 0.8645
Step  200/597 - Gen Loss: 450908.2500, Critic Loss: -539709.6250, W-Loss: -790906.6250, GP: 25119.2012, Acc: 0.8574
Step  300/597 - Gen Loss: 452102.8125, Critic Loss: -543916.8125, W-Loss: -801672.2500, GP: 25775.0645, Acc: 0.8553
Step  400/597 - Gen Loss: 461958.5000, Critic Loss: -540601.6250, W-Loss: -801479.6250, GP: 26087.2754, Acc: 0.8594
Step  500/597 - Gen Loss: 473245.2500, Critic Loss: -533154.6250, W-Loss: -796866.5000, GP: 26370.7090, Acc: 0.8593

Generating Grid of Samples at Final Epoch 50:
Generating 6 samples per class for all 16 letter classes...
----------------------------------------------------------------------

Final Generated Samples - All Letter Classes:
Generating 6 samples per class for all 16 letter classes...

Observation:¶

  • Generated Samples: Many letters are poorly formed or unrecognizable, with severe distortions and inconsistent stroke patterns across most classes.
  • Training Losses: Generator and critic losses fluctuate heavily and reach extreme magnitudes (hundreds of thousands by epoch 50), indicating unstable training dynamics.
  • Critic Accuracy: Starts stable but declines after roughly 30 epochs, indicating a weakening critic.
  • Loss Balance: The extremely large and growing gap between generator and critic losses suggests a severe imbalance between the two networks.
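The instability noted above can be quantified rather than judged by eye, for example with a rolling standard deviation of the loss curve: a sustained spike marks the onset of divergence. A minimal NumPy sketch, where the `rolling_std` helper and the synthetic loss series are illustrative, not taken from the notebook:

```python
import numpy as np

def rolling_std(x, window=10):
    """Standard deviation over a sliding window; output length len(x) - window + 1."""
    windows = np.lib.stride_tricks.sliding_window_view(x, window)
    return windows.std(axis=1)

# Synthetic loss curve: stable for 30 "epochs", then progressively diverging
rng = np.random.default_rng(1)
stable = rng.normal(0.0, 1.0, size=30)
diverging = rng.normal(0.0, 1.0, size=20) * np.linspace(1, 50, 20)
losses = np.concatenate([stable, diverging])

vol = rolling_std(losses, window=10)
# First window whose volatility is 5x the early-training baseline
onset = int(np.argmax(vol > 5 * vol[:10].mean()))
print(onset)
```

Applied to the recorded `gen_loss`/`critic_loss` histories, this kind of monitor would flag the divergence well before epoch 50.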

Enhanced WGAN implementation¶

This section implements an Enhanced WGAN with state-of-the-art improvements:

Enhanced Architecture:¶

  • Enhanced Generator: Residual blocks + Self-attention mechanism + Advanced upsampling
  • Enhanced Discriminator: Spectral normalization + Advanced downsampling + Improved stability
  • Balanced Training: Prevents discriminator dominance with multiple techniques
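Spectral normalization, listed above for the enhanced discriminator, rescales each weight matrix to (approximately) unit spectral norm, estimated by power iteration. A minimal NumPy sketch of the underlying computation; the `spectral_normalize` helper, seeds, and iteration count are illustrative, not the notebook's implementation (Keras users would typically wrap layers instead):

```python
import numpy as np

def spectral_normalize(W, n_iters=100, eps=1e-12):
    """Estimate the largest singular value of W via power iteration
    and return W scaled to approximately unit spectral norm."""
    u = np.random.default_rng(0).normal(size=W.shape[0])
    for _ in range(n_iters):
        v = W.T @ u
        v /= np.linalg.norm(v) + eps
        u = W @ v
        u /= np.linalg.norm(u) + eps
    sigma = u @ W @ v  # Rayleigh-quotient estimate of the top singular value
    return W / sigma

W = np.random.default_rng(42).normal(size=(64, 128))
W_sn = spectral_normalize(W)
print(np.linalg.norm(W_sn, 2))  # ≈ 1.0
```

Bounding the spectral norm of every layer bounds the Lipschitz constant of the whole critic, which is why it complements (or substitutes for) the gradient penalty.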

Training Balance Features:¶

  • Critic-to-Generator Ratio: Train the critic several times per generator step (e.g., n_critic = 5) so it better approximates the Wasserstein distance.
  • Gradient Penalty: Use λ ≈ 10 to enforce the 1-Lipschitz constraint (preferred over weight clipping).
  • Loss Balance: Keep critic and generator losses within a reasonable range of each other; avoid letting either dominate.
  • Learning Rates: Often equal (e.g., 1e-4), or slightly higher for the critic.
  • Batch Size: 64–256 for stable Wasserstein estimates.
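The gradient-penalty term above is λ·(‖∇x̂ D(x̂)‖₂ − 1)², evaluated at random interpolates x̂ = ε·x_real + (1 − ε)·x_fake. A minimal NumPy sketch of the arithmetic using a toy linear critic, whose gradient is known in closed form (the critic and all names here are illustrative, not the notebook's models, which would use `tf.GradientTape` instead):

```python
import numpy as np

rng = np.random.default_rng(0)
batch, dim, LAMBDA = 8, 784, 10.0

# Toy linear critic D(x) = w @ x, scaled so ||w|| is near 1
w = rng.normal(size=dim) / np.sqrt(dim)
real = rng.normal(size=(batch, dim))
fake = rng.normal(size=(batch, dim))

# Random interpolates between real and fake samples (one epsilon per sample)
eps = rng.uniform(size=(batch, 1))
x_hat = eps * real + (1.0 - eps) * fake

# For a linear critic, grad_x D(x_hat) = w at every interpolate
grads = np.tile(w, (batch, 1))
grad_norms = np.linalg.norm(grads, axis=1)

# Two-sided penalty: push gradient norms toward exactly 1
gp = LAMBDA * np.mean((grad_norms - 1.0) ** 2)
print(gp)
```

In the real training step the gradients come from backpropagation through the critic at `x_hat`; the penalty is added to the critic loss, giving the W-Loss + GP decomposition printed in the logs above.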
In [25]:
# =============================================================================
# ENHANCED WGAN TRAINING - 50 EPOCHS
# =============================================================================

# Test visualization first
test_images_enhanced, test_labels_enhanced = display_generated_samples_grid(
    enhanced_wgan_generator, class_to_letter, samples_per_class=6
)

# Training configuration
NUM_EPOCHS = 50

start_time = time.time()

for epoch in range(NUM_EPOCHS):
    epoch_start = time.time()
    
    # Train for one epoch with WGAN-GP strategy
    enhanced_wgan_trainer_new.train_epoch(train_dataset, epoch, steps_per_epoch)
    
    # Display only on the final epoch
    if (epoch + 1) == NUM_EPOCHS:
        print(f"\nGenerating Grid of Samples at Final Epoch {epoch + 1}:")
        display_generated_samples_grid(enhanced_wgan_generator, class_to_letter, epoch + 1, samples_per_class=6)
    
    # Track per-epoch timing and estimated time remaining
    epoch_time = time.time() - epoch_start
    total_time = time.time() - start_time
    avg_time = total_time / (epoch + 1)
    eta = avg_time * (NUM_EPOCHS - epoch - 1)

total_training_time = time.time() - start_time

# Generate final display
print(f"\nFinal Generated Samples - All Letter Classes:")
final_images_enhanced, final_labels_enhanced = display_generated_samples_grid(
    enhanced_wgan_generator, class_to_letter, NUM_EPOCHS, samples_per_class=6
)

# Plot training progress
if len(enhanced_wgan_trainer_new.history['gen_loss']) > 1:
    plt.figure(figsize=(15, 5))
    
    # Generator and Critic Loss
    plt.subplot(1, 3, 1)
    epochs = enhanced_wgan_trainer_new.history['epoch']
    plt.plot(epochs, enhanced_wgan_trainer_new.history['gen_loss'], label='Generator Loss', color='blue', linewidth=2)
    plt.plot(epochs, enhanced_wgan_trainer_new.history['critic_loss'], label='Critic Loss', color='red', linewidth=2)
    plt.title('Enhanced WGAN - Training Losses', fontsize=14, fontweight='bold')
    plt.xlabel('Epoch')
    plt.ylabel('Loss')
    plt.legend()
    plt.grid(True, alpha=0.3)
    
    # Critic Accuracy
    plt.subplot(1, 3, 2)
    plt.plot(epochs, enhanced_wgan_trainer_new.history['label_accuracy'], label='Critic Accuracy', color='green', linewidth=2)
    plt.axhline(y=0.95, color='red', linestyle='--', alpha=0.7, label='Upper limit (95%)')
    plt.axhline(y=0.70, color='orange', linestyle='--', alpha=0.7, label='Lower limit (70%)')
    plt.title('Enhanced WGAN - Critic Accuracy', fontsize=14, fontweight='bold')
    plt.xlabel('Epoch')
    plt.ylabel('Accuracy')
    plt.legend()
    plt.grid(True, alpha=0.3)
    
    # Loss Difference
    plt.subplot(1, 3, 3)
    loss_diff = [g - c for g, c in zip(enhanced_wgan_trainer_new.history['gen_loss'], enhanced_wgan_trainer_new.history['critic_loss'])]
    plt.plot(epochs, loss_diff, label='Gen Loss - Critic Loss', color='purple', linewidth=2)
    plt.axhline(y=0, color='black', linestyle='-', alpha=0.5)
    plt.title('Enhanced WGAN - Loss Balance', fontsize=14, fontweight='bold')
    plt.xlabel('Epoch')
    plt.ylabel('Loss Difference')
    plt.legend()
    plt.grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.show()
Generating 6 samples per class for all 16 letter classes...
Epoch 1
------------------------------------------------------------
Step    0/597 - Gen Loss: 2.7167, Critic Loss: 33.2300, W-Loss: -0.3786, GP: 2.9746, Acc: 0.0469
Step  100/597 - Gen Loss: 2.1010, Critic Loss: -1.2574, W-Loss: -4.7796, GP: 0.1146, Acc: 0.4529
Step  200/597 - Gen Loss: -2.6419, Critic Loss: -2.7091, W-Loss: -5.1462, GP: 0.1050, Acc: 0.6857
Step  300/597 - Gen Loss: -6.9175, Critic Loss: -2.8593, W-Loss: -4.8888, GP: 0.1028, Acc: 0.7784
Step  400/597 - Gen Loss: -10.0971, Critic Loss: -2.9392, W-Loss: -4.7881, GP: 0.1052, Acc: 0.8283
Step  500/597 - Gen Loss: -12.8246, Critic Loss: -3.3547, W-Loss: -5.2347, GP: 0.1212, Acc: 0.8597

Epoch 2
------------------------------------------------------------
Step    0/597 - Gen Loss: -86.6805, Critic Loss: -11.9089, W-Loss: -14.6030, GP: 0.2604, Acc: 0.9969
Step  100/597 - Gen Loss: -96.9808, Critic Loss: -40.4296, W-Loss: -60.0598, GP: 1.9190, Acc: 0.9693
Step  200/597 - Gen Loss: -142.4918, Critic Loss: -81.5796, W-Loss: -127.5368, GP: 4.5006, Acc: 0.9502
Step  300/597 - Gen Loss: -282.2530, Critic Loss: -229.5090, W-Loss: -376.2630, GP: 14.3835, Acc: 0.9112
Step  400/597 - Gen Loss: -598.0519, Critic Loss: -512.4686, W-Loss: -883.1915, GP: 36.3976, Acc: 0.8682
Step  500/597 - Gen Loss: -783.2605, Critic Loss: -595.1448, W-Loss: -1113.9314, GP: 51.1531, Acc: 0.8618

Epoch 3
------------------------------------------------------------
Step    0/597 - Gen Loss: -4670.4731, Critic Loss: -3178.0317, W-Loss: -5961.8901, GP: 276.4044, Acc: 0.7531
Step  100/597 - Gen Loss: -5981.6719, Critic Loss: -5057.1304, W-Loss: -9474.3662, GP: 437.6936, Acc: 0.6610
Step  200/597 - Gen Loss: -7234.5049, Critic Loss: -5677.2612, W-Loss: -10360.4980, GP: 464.7543, Acc: 0.6914
Step  300/597 - Gen Loss: -9914.6504, Critic Loss: -5243.4536, W-Loss: -9532.2227, GP: 425.3211, Acc: 0.7174
Step  400/597 - Gen Loss: -10958.1836, Critic Loss: -3941.6558, W-Loss: -7162.2427, GP: 319.2760, Acc: 0.7712
Step  500/597 - Gen Loss: -11550.4727, Critic Loss: -3146.6531, W-Loss: -5725.3496, GP: 255.5672, Acc: 0.8047

Epoch 4
------------------------------------------------------------
Step    0/597 - Gen Loss: -13662.2021, Critic Loss: 259.2082, W-Loss: 256.6541, GP: 0.1231, Acc: 0.9531
Step  100/597 - Gen Loss: -8812.6445, Critic Loss: 29.4775, W-Loss: 26.6047, GP: 0.0919, Acc: 0.9556
Step  200/597 - Gen Loss: -9353.0586, Critic Loss: 5.2813, W-Loss: 2.5771, GP: 0.0823, Acc: 0.9577
Step  300/597 - Gen Loss: -9547.2920, Critic Loss: 11.1499, W-Loss: 8.4841, GP: 0.0796, Acc: 0.9577
Step  400/597 - Gen Loss: -8963.0068, Critic Loss: 0.4747, W-Loss: -2.2500, GP: 0.0773, Acc: 0.9555
Step  500/597 - Gen Loss: -8974.3477, Critic Loss: 1.0067, W-Loss: -1.7939, GP: 0.0765, Acc: 0.9544

Epoch 5
------------------------------------------------------------
Step    0/597 - Gen Loss: -12382.9053, Critic Loss: -254.4980, W-Loss: -257.5486, GP: 0.0223, Acc: 0.9500
Step  100/597 - Gen Loss: -14304.0820, Critic Loss: -0.1485, W-Loss: -3.4812, GP: 0.0763, Acc: 0.9532
Step  200/597 - Gen Loss: -14587.1738, Critic Loss: -39.1140, W-Loss: -42.4118, GP: 0.0817, Acc: 0.9559
Step  300/597 - Gen Loss: -14103.5947, Critic Loss: -3.3762, W-Loss: -6.6839, GP: 0.0802, Acc: 0.9544
Step  400/597 - Gen Loss: -13609.6484, Critic Loss: 3.3627, W-Loss: 0.0784, GP: 0.0823, Acc: 0.9540
Step  500/597 - Gen Loss: -13508.3877, Critic Loss: 8.0946, W-Loss: 4.7907, GP: 0.0820, Acc: 0.9532

Epoch 6
------------------------------------------------------------
Step    0/597 - Gen Loss: -13617.5479, Critic Loss: 87.0745, W-Loss: 82.4039, GP: 0.0323, Acc: 0.9438
Step  100/597 - Gen Loss: -16997.3477, Critic Loss: -58.4535, W-Loss: -62.0896, GP: 0.0860, Acc: 0.9533
Step  200/597 - Gen Loss: -17406.6250, Critic Loss: -9.3158, W-Loss: -13.0053, GP: 0.0936, Acc: 0.9544
Step  300/597 - Gen Loss: -16888.5176, Critic Loss: 1.8309, W-Loss: -1.8050, GP: 0.0887, Acc: 0.9541
Step  400/597 - Gen Loss: -16689.1738, Critic Loss: 2.4837, W-Loss: -1.0685, GP: 0.0869, Acc: 0.9545
Step  500/597 - Gen Loss: -16938.2188, Critic Loss: -5.5750, W-Loss: -9.1461, GP: 0.0915, Acc: 0.9556

Epoch 7
------------------------------------------------------------
Step    0/597 - Gen Loss: -16458.8711, Critic Loss: 482.6991, W-Loss: 478.8926, GP: 0.0297, Acc: 0.9656
Step  100/597 - Gen Loss: -15824.9912, Critic Loss: -6.3358, W-Loss: -9.2502, GP: 0.0845, Acc: 0.9674
Step  200/597 - Gen Loss: -17470.1973, Critic Loss: 12.3859, W-Loss: 9.2963, GP: 0.0859, Acc: 0.9666
Step  300/597 - Gen Loss: -17899.1875, Critic Loss: 3.1828, W-Loss: 0.0500, GP: 0.0828, Acc: 0.9667
Step  400/597 - Gen Loss: -16825.0332, Critic Loss: -5.1505, W-Loss: -8.3023, GP: 0.0836, Acc: 0.9661
Step  500/597 - Gen Loss: -16628.6484, Critic Loss: 6.7888, W-Loss: 3.6121, GP: 0.0837, Acc: 0.9660

Epoch 8
------------------------------------------------------------
Step    0/597 - Gen Loss: -20672.5840, Critic Loss: -1300.8015, W-Loss: -1304.1031, GP: 0.0286, Acc: 0.9594
Step  100/597 - Gen Loss: -20168.1055, Critic Loss: -24.7424, W-Loss: -27.8713, GP: 0.0830, Acc: 0.9712
Step  200/597 - Gen Loss: -21731.3379, Critic Loss: -73.7769, W-Loss: -77.3980, GP: 0.1000, Acc: 0.9692
Step  300/597 - Gen Loss: -23760.8301, Critic Loss: -115.0002, W-Loss: -119.0170, GP: 0.1061, Acc: 0.9673
Step  400/597 - Gen Loss: -25101.9941, Critic Loss: -40.7205, W-Loss: -44.8265, GP: 0.1046, Acc: 0.9667
Step  500/597 - Gen Loss: -25894.0332, Critic Loss: -52.7869, W-Loss: -56.8180, GP: 0.1049, Acc: 0.9675

Epoch 9
------------------------------------------------------------
Step    0/597 - Gen Loss: -30586.1133, Critic Loss: -629.5369, W-Loss: -635.8043, GP: 0.0338, Acc: 0.9563
Step  100/597 - Gen Loss: -28303.3555, Critic Loss: -44.0371, W-Loss: -49.0650, GP: 0.1489, Acc: 0.9666
Step  200/597 - Gen Loss: -28272.6719, Critic Loss: -89.3060, W-Loss: -94.2887, GP: 0.1418, Acc: 0.9668
Step  300/597 - Gen Loss: -28131.9238, Critic Loss: -88.8883, W-Loss: -93.8081, GP: 0.1397, Acc: 0.9681
Step  400/597 - Gen Loss: -27832.1094, Critic Loss: -48.8302, W-Loss: -53.5618, GP: 0.1360, Acc: 0.9685
Step  500/597 - Gen Loss: -27554.7285, Critic Loss: -39.1474, W-Loss: -43.8643, GP: 0.1364, Acc: 0.9685

Epoch 10
------------------------------------------------------------
Step    0/597 - Gen Loss: -25300.8398, Critic Loss: 420.3851, W-Loss: 417.4316, GP: 0.0363, Acc: 0.9750
Step  100/597 - Gen Loss: -23265.5645, Critic Loss: 117.0003, W-Loss: 112.6757, GP: 0.1052, Acc: 0.9690
Step  200/597 - Gen Loss: -20707.7715, Critic Loss: 72.5485, W-Loss: 68.3112, GP: 0.1033, Acc: 0.9700
Step  300/597 - Gen Loss: -18848.5059, Critic Loss: 84.9589, W-Loss: 80.7291, GP: 0.1021, Acc: 0.9702
Step  400/597 - Gen Loss: -17387.1523, Critic Loss: 76.6238, W-Loss: 72.5170, GP: 0.0988, Acc: 0.9705
Step  500/597 - Gen Loss: -17573.3105, Critic Loss: 28.0955, W-Loss: 24.0432, GP: 0.1000, Acc: 0.9711

Epoch 11
------------------------------------------------------------
Step    0/597 - Gen Loss: -23380.4512, Critic Loss: -2075.6108, W-Loss: -2078.4031, GP: 0.0385, Acc: 0.9844
Step  100/597 - Gen Loss: -24352.6016, Critic Loss: 28.4778, W-Loss: 23.9502, GP: 0.1234, Acc: 0.9741
Step  200/597 - Gen Loss: -25787.3281, Critic Loss: 2.6560, W-Loss: -1.7781, GP: 0.1273, Acc: 0.9759
Step  300/597 - Gen Loss: -25109.1387, Critic Loss: -13.9514, W-Loss: -18.1882, GP: 0.1251, Acc: 0.9764
Step  400/597 - Gen Loss: -23271.3066, Critic Loss: 28.8100, W-Loss: 24.8011, GP: 0.1193, Acc: 0.9769
Step  500/597 - Gen Loss: -23177.2695, Critic Loss: -15.3663, W-Loss: -19.5541, GP: 0.1186, Acc: 0.9762

Epoch 12
------------------------------------------------------------
Step    0/597 - Gen Loss: -23673.5957, Critic Loss: 226.2976, W-Loss: 224.3293, GP: 0.1323, Acc: 0.9875
Step  100/597 - Gen Loss: -26340.5684, Critic Loss: -81.8351, W-Loss: -85.5434, GP: 0.1240, Acc: 0.9808
Step  200/597 - Gen Loss: -28308.2344, Critic Loss: -63.0673, W-Loss: -66.9523, GP: 0.1311, Acc: 0.9808
Step  300/597 - Gen Loss: -29418.4824, Critic Loss: -101.4678, W-Loss: -105.4203, GP: 0.1290, Acc: 0.9806
Step  400/597 - Gen Loss: -30519.2891, Critic Loss: -16.7718, W-Loss: -20.9300, GP: 0.1293, Acc: 0.9797
Step  500/597 - Gen Loss: -31309.6250, Critic Loss: -33.6888, W-Loss: -38.1291, GP: 0.1275, Acc: 0.9780

Epoch 13
------------------------------------------------------------
Step    0/597 - Gen Loss: -38828.1250, Critic Loss: -796.6964, W-Loss: -801.9547, GP: 0.3665, Acc: 0.9875
Step  100/597 - Gen Loss: -38111.6094, Critic Loss: -6.7531, W-Loss: -12.0202, GP: 0.1528, Acc: 0.9759
Step  200/597 - Gen Loss: -41307.6094, Critic Loss: -27.7512, W-Loss: -33.0746, GP: 0.1583, Acc: 0.9763
Step  300/597 - Gen Loss: -42367.4688, Critic Loss: -75.4710, W-Loss: -80.7417, GP: 0.1597, Acc: 0.9770
Step  400/597 - Gen Loss: -42121.2031, Critic Loss: -5.5436, W-Loss: -10.7188, GP: 0.1485, Acc: 0.9767
Step  500/597 - Gen Loss: -40699.9844, Critic Loss: 23.0354, W-Loss: 17.8646, GP: 0.1438, Acc: 0.9765

Epoch 14
------------------------------------------------------------
Step    0/597 - Gen Loss: -38517.5508, Critic Loss: -1927.0022, W-Loss: -1934.7360, GP: 0.5617, Acc: 0.9906
Step  100/597 - Gen Loss: -37307.4727, Critic Loss: -101.6977, W-Loss: -106.7268, GP: 0.1623, Acc: 0.9798
Step  200/597 - Gen Loss: -37217.1797, Critic Loss: 45.1362, W-Loss: 40.4191, GP: 0.1464, Acc: 0.9806
Step  300/597 - Gen Loss: -37901.3594, Critic Loss: -68.5524, W-Loss: -73.2436, GP: 0.1464, Acc: 0.9804
Step  400/597 - Gen Loss: -37003.3555, Critic Loss: 6.3530, W-Loss: 1.6252, GP: 0.1419, Acc: 0.9801
Step  500/597 - Gen Loss: -36803.6406, Critic Loss: -27.5981, W-Loss: -32.3026, GP: 0.1409, Acc: 0.9800

Epoch 15
------------------------------------------------------------
Step    0/597 - Gen Loss: -41720.1953, Critic Loss: 192.4202, W-Loss: 187.9820, GP: 0.0441, Acc: 0.9781
Step  100/597 - Gen Loss: -47916.4648, Critic Loss: -465.1123, W-Loss: -470.7170, GP: 0.1599, Acc: 0.9795
Step  200/597 - Gen Loss: -49636.6797, Critic Loss: -7.5724, W-Loss: -13.1388, GP: 0.1614, Acc: 0.9802
Step  300/597 - Gen Loss: -49609.5625, Critic Loss: -194.9079, W-Loss: -200.2851, GP: 0.1633, Acc: 0.9808
Step  400/597 - Gen Loss: -50584.3594, Critic Loss: -87.5600, W-Loss: -92.8462, GP: 0.1631, Acc: 0.9812
Step  500/597 - Gen Loss: -50534.5547, Critic Loss: -42.0485, W-Loss: -47.2682, GP: 0.1640, Acc: 0.9814

Epoch 16
------------------------------------------------------------
Step    0/597 - Gen Loss: -50350.1875, Critic Loss: -2571.9956, W-Loss: -2577.5750, GP: 0.0462, Acc: 0.9812
Step  100/597 - Gen Loss: -56696.1250, Critic Loss: -346.8472, W-Loss: -352.8972, GP: 0.2123, Acc: 0.9814
Step  200/597 - Gen Loss: -52856.5234, Critic Loss: -94.5564, W-Loss: -100.6388, GP: 0.2042, Acc: 0.9820
Step  300/597 - Gen Loss: -52640.7305, Critic Loss: -247.2421, W-Loss: -253.2995, GP: 0.2039, Acc: 0.9824
Step  400/597 - Gen Loss: -54293.1406, Critic Loss: -155.7221, W-Loss: -161.9988, GP: 0.2075, Acc: 0.9812
Step  500/597 - Gen Loss: -55089.5234, Critic Loss: -89.4649, W-Loss: -95.8331, GP: 0.2036, Acc: 0.9806

Epoch 17
------------------------------------------------------------
Step    0/597 - Gen Loss: -57243.0742, Critic Loss: 3017.8516, W-Loss: 3015.0891, GP: 0.0440, Acc: 0.9844
Step  100/597 - Gen Loss: -57377.1133, Critic Loss: 564.8786, W-Loss: 557.8820, GP: 0.2286, Acc: 0.9797
Step  200/597 - Gen Loss: -57328.1094, Critic Loss: 60.6666, W-Loss: 53.6368, GP: 0.2234, Acc: 0.9794
Step  300/597 - Gen Loss: -56756.7656, Critic Loss: -69.6664, W-Loss: -76.8506, GP: 0.2206, Acc: 0.9788
Step  400/597 - Gen Loss: -57798.3828, Critic Loss: -87.6272, W-Loss: -94.9635, GP: 0.2230, Acc: 0.9787
Step  500/597 - Gen Loss: -58888.8477, Critic Loss: -22.2883, W-Loss: -29.5428, GP: 0.2211, Acc: 0.9789

Epoch 18
------------------------------------------------------------
Step    0/597 - Gen Loss: -63384.4727, Critic Loss: -3100.4421, W-Loss: -3102.4507, GP: 0.0357, Acc: 0.9781
Step  100/597 - Gen Loss: -59800.5156, Critic Loss: 157.6357, W-Loss: 151.0717, GP: 0.2230, Acc: 0.9817
Step  200/597 - Gen Loss: -61884.6719, Critic Loss: 40.1487, W-Loss: 33.0402, GP: 0.2429, Acc: 0.9807
Step  300/597 - Gen Loss: -63034.1133, Critic Loss: -349.7399, W-Loss: -357.2279, GP: 0.2549, Acc: 0.9806
Step  400/597 - Gen Loss: -64586.0000, Critic Loss: -134.8826, W-Loss: -142.7921, GP: 0.2564, Acc: 0.9796
Step  500/597 - Gen Loss: -66164.6250, Critic Loss: -172.4624, W-Loss: -180.5146, GP: 0.2556, Acc: 0.9791

Epoch 19
------------------------------------------------------------
Step    0/597 - Gen Loss: -74807.9531, Critic Loss: 3814.1128, W-Loss: 3808.1797, GP: 0.0591, Acc: 0.9906
Step  100/597 - Gen Loss: -82609.3281, Critic Loss: 126.0955, W-Loss: 117.5596, GP: 0.3024, Acc: 0.9821
Step  200/597 - Gen Loss: -81496.3359, Critic Loss: 68.5711, W-Loss: 60.1915, GP: 0.2841, Acc: 0.9811
Step  300/597 - Gen Loss: -78903.3594, Critic Loss: -311.6729, W-Loss: -319.7769, GP: 0.2847, Acc: 0.9817
Step  400/597 - Gen Loss: -78170.3828, Critic Loss: -27.8845, W-Loss: -35.7989, GP: 0.2834, Acc: 0.9819
Step  500/597 - Gen Loss: -76587.5156, Critic Loss: -82.3007, W-Loss: -90.1759, GP: 0.2853, Acc: 0.9819

Epoch 20
------------------------------------------------------------
Step    0/597 - Gen Loss: -75609.6094, Critic Loss: -15023.5107, W-Loss: -15030.4971, GP: 0.1101, Acc: 0.9875
Step  100/597 - Gen Loss: -67420.8438, Critic Loss: 131.5867, W-Loss: 121.9563, GP: 0.3089, Acc: 0.9776
Step  200/597 - Gen Loss: -66191.2109, Critic Loss: 198.2640, W-Loss: 189.0795, GP: 0.3284, Acc: 0.9794
Step  300/597 - Gen Loss: -60826.3438, Critic Loss: -9.1371, W-Loss: -17.8807, GP: 0.3195, Acc: 0.9800
Step  400/597 - Gen Loss: -57690.7891, Critic Loss: 243.9167, W-Loss: 235.5121, GP: 0.3077, Acc: 0.9803
Step  500/597 - Gen Loss: -55511.6602, Critic Loss: 55.5079, W-Loss: 47.2322, GP: 0.3027, Acc: 0.9805

Epoch 21
------------------------------------------------------------
Step    0/597 - Gen Loss: -76004.3125, Critic Loss: -10649.7129, W-Loss: -10658.1660, GP: 0.3175, Acc: 0.9844
Step  100/597 - Gen Loss: -68918.1094, Critic Loss: 147.8934, W-Loss: 140.3162, GP: 0.3162, Acc: 0.9844
Step  200/597 - Gen Loss: -69190.8281, Critic Loss: -90.1557, W-Loss: -98.1506, GP: 0.3382, Acc: 0.9841
Step  300/597 - Gen Loss: -69091.1328, Critic Loss: -271.9482, W-Loss: -279.9930, GP: 0.3211, Acc: 0.9832
Step  400/597 - Gen Loss: -71764.8750, Critic Loss: -182.3465, W-Loss: -190.4660, GP: 0.3180, Acc: 0.9831
Step  500/597 - Gen Loss: -73353.6953, Critic Loss: -315.2722, W-Loss: -323.5867, GP: 0.3150, Acc: 0.9826

Epoch 22
------------------------------------------------------------
Step    0/597 - Gen Loss: -102564.7422, Critic Loss: 16175.6621, W-Loss: 16159.6172, GP: 0.7151, Acc: 0.9781
Step  100/597 - Gen Loss: -90823.5625, Critic Loss: 386.5254, W-Loss: 375.8613, GP: 0.3647, Acc: 0.9804
Step  200/597 - Gen Loss: -93782.5000, Critic Loss: -206.1927, W-Loss: -217.4782, GP: 0.3970, Acc: 0.9797
Step  300/597 - Gen Loss: -91082.0938, Critic Loss: 1.8338, W-Loss: -9.1041, GP: 0.4103, Acc: 0.9803
Step  400/597 - Gen Loss: -87120.9844, Critic Loss: 135.1419, W-Loss: 124.3412, GP: 0.4022, Acc: 0.9803
Step  500/597 - Gen Loss: -85506.4844, Critic Loss: 29.3554, W-Loss: 18.5799, GP: 0.3879, Acc: 0.9795

Epoch 23
------------------------------------------------------------
Step    0/597 - Gen Loss: -92664.0703, Critic Loss: -3243.7290, W-Loss: -3247.6765, GP: 0.0531, Acc: 0.9844
Step  100/597 - Gen Loss: -102062.5156, Critic Loss: 5.5750, W-Loss: -4.1771, GP: 0.2778, Acc: 0.9811
Step  200/597 - Gen Loss: -93933.4609, Critic Loss: 551.4724, W-Loss: 541.9943, GP: 0.2971, Acc: 0.9813
Step  300/597 - Gen Loss: -91223.3594, Critic Loss: -329.9706, W-Loss: -339.6768, GP: 0.3093, Acc: 0.9809
Step  400/597 - Gen Loss: -90839.4688, Critic Loss: 73.0164, W-Loss: 63.4503, GP: 0.2978, Acc: 0.9809
Step  500/597 - Gen Loss: -91440.9609, Critic Loss: -67.2012, W-Loss: -76.8337, GP: 0.3064, Acc: 0.9809

Epoch 24
------------------------------------------------------------
Step    0/597 - Gen Loss: -109458.6484, Critic Loss: 11880.2051, W-Loss: 11873.5566, GP: 0.1313, Acc: 0.9844
Step  100/597 - Gen Loss: -123289.2891, Critic Loss: -281.2260, W-Loss: -293.4010, GP: 0.3286, Acc: 0.9768
Step  200/597 - Gen Loss: -127626.9062, Critic Loss: -291.0522, W-Loss: -303.5034, GP: 0.3445, Acc: 0.9781
Step  300/597 - Gen Loss: -126426.0703, Critic Loss: -1106.4806, W-Loss: -1118.7548, GP: 0.3594, Acc: 0.9785
Step  400/597 - Gen Loss: -130278.0469, Critic Loss: -475.0466, W-Loss: -487.5669, GP: 0.3750, Acc: 0.9783
Step  500/597 - Gen Loss: -133265.4844, Critic Loss: -710.1071, W-Loss: -722.9640, GP: 0.3917, Acc: 0.9782

Epoch 25
------------------------------------------------------------
Step    0/597 - Gen Loss: -144947.4688, Critic Loss: 9895.6943, W-Loss: 9879.7998, GP: 0.0795, Acc: 0.9781
Step  100/597 - Gen Loss: -157964.4375, Critic Loss: 12.8840, W-Loss: -5.3693, GP: 0.5236, Acc: 0.9735
Step  200/597 - Gen Loss: -171242.9375, Critic Loss: 643.7300, W-Loss: 624.9692, GP: 0.4979, Acc: 0.9731
Step  300/597 - Gen Loss: -168619.3281, Critic Loss: -1147.8461, W-Loss: -1166.6741, GP: 0.4992, Acc: 0.9728
Step  400/597 - Gen Loss: -169376.2344, Critic Loss: -1008.9179, W-Loss: -1028.1578, GP: 0.5030, Acc: 0.9720
Step  500/597 - Gen Loss: -176080.2969, Critic Loss: -1035.0625, W-Loss: -1054.5791, GP: 0.5033, Acc: 0.9714

Epoch 26
------------------------------------------------------------
Step    0/597 - Gen Loss: -183721.8125, Critic Loss: 1951.7078, W-Loss: 1904.0750, GP: 2.4617, Acc: 0.9594
Step  100/597 - Gen Loss: -180795.5625, Critic Loss: 701.2108, W-Loss: 676.5483, GP: 0.6097, Acc: 0.9680
Step  200/597 - Gen Loss: -166383.4062, Critic Loss: 1327.6206, W-Loss: 1304.1063, GP: 0.5726, Acc: 0.9681
Step  300/597 - Gen Loss: -162378.5469, Critic Loss: 81.2755, W-Loss: 59.0144, GP: 0.5470, Acc: 0.9692
Step  400/597 - Gen Loss: -163864.6406, Critic Loss: 586.7202, W-Loss: 565.3774, GP: 0.5363, Acc: 0.9705
Step  500/597 - Gen Loss: -167832.5625, Critic Loss: -35.8559, W-Loss: -56.4861, GP: 0.5214, Acc: 0.9711

Epoch 27
------------------------------------------------------------
Step    0/597 - Gen Loss: -183116.0156, Critic Loss: -22219.5000, W-Loss: -22241.9062, GP: 0.1554, Acc: 0.9656
Step  100/597 - Gen Loss: -256930.3438, Critic Loss: 2264.1482, W-Loss: 2234.6252, GP: 0.8470, Acc: 0.9672
Step  200/597 - Gen Loss: -247217.0781, Critic Loss: 1975.4230, W-Loss: 1949.3900, GP: 0.7986, Acc: 0.9703
Step  300/597 - Gen Loss: -234745.7812, Critic Loss: 576.3001, W-Loss: 552.3591, GP: 0.7447, Acc: 0.9713
Step  400/597 - Gen Loss: -235862.1250, Critic Loss: 461.7574, W-Loss: 438.2850, GP: 0.7528, Acc: 0.9717
Step  500/597 - Gen Loss: -239211.0156, Critic Loss: 144.4145, W-Loss: 121.2388, GP: 0.7386, Acc: 0.9719

Epoch 28
------------------------------------------------------------
Step    0/597 - Gen Loss: -251425.9531, Critic Loss: -53841.3516, W-Loss: -53861.4141, GP: 0.4671, Acc: 0.9625
Step  100/597 - Gen Loss: -272933.0312, Critic Loss: -602.4647, W-Loss: -633.7369, GP: 0.8976, Acc: 0.9670
Step  200/597 - Gen Loss: -268260.3125, Critic Loss: 627.8726, W-Loss: 597.3129, GP: 0.8862, Acc: 0.9674
Step  300/597 - Gen Loss: -262896.5625, Critic Loss: -1689.8218, W-Loss: -1719.1450, GP: 0.8842, Acc: 0.9686
Step  400/597 - Gen Loss: -258442.5156, Critic Loss: -868.4567, W-Loss: -897.6013, GP: 0.9233, Acc: 0.9690
Step  500/597 - Gen Loss: -255824.0156, Critic Loss: -1612.6184, W-Loss: -1641.9011, GP: 0.9482, Acc: 0.9693

Epoch 29
------------------------------------------------------------
Step    0/597 - Gen Loss: -294096.1875, Critic Loss: -18102.8477, W-Loss: -18125.6504, GP: 0.0923, Acc: 0.9688
Step  100/597 - Gen Loss: -330186.5000, Critic Loss: -3905.9971, W-Loss: -3951.2349, GP: 1.5572, Acc: 0.9626
Step  200/597 - Gen Loss: -317174.3438, Critic Loss: -3503.5686, W-Loss: -3548.1201, GP: 1.6199, Acc: 0.9632
Step  300/597 - Gen Loss: -313829.9688, Critic Loss: -4159.4902, W-Loss: -4204.1040, GP: 1.6017, Acc: 0.9630
Step  400/597 - Gen Loss: -325780.8750, Critic Loss: -2667.8879, W-Loss: -2712.8035, GP: 1.5843, Acc: 0.9625
Step  500/597 - Gen Loss: -337403.8438, Critic Loss: -3076.8047, W-Loss: -3122.1243, GP: 1.5705, Acc: 0.9619

Epoch 30
------------------------------------------------------------
Step    0/597 - Gen Loss: -428664.4688, Critic Loss: -31740.1367, W-Loss: -31786.1680, GP: 0.1127, Acc: 0.9500
Step  100/597 - Gen Loss: -449782.5625, Critic Loss: -800.6113, W-Loss: -861.1158, GP: 1.9668, Acc: 0.9523
Step  200/597 - Gen Loss: -473022.8750, Critic Loss: 1075.6787, W-Loss: 1016.3713, GP: 1.8941, Acc: 0.9520
Step  300/597 - Gen Loss: -471442.5000, Critic Loss: -1247.7964, W-Loss: -1304.4181, GP: 1.7973, Acc: 0.9539
Step  400/597 - Gen Loss: -466272.6250, Critic Loss: 976.9465, W-Loss: 922.9463, GP: 1.7475, Acc: 0.9556
Step  500/597 - Gen Loss: -459806.8438, Critic Loss: -362.6032, W-Loss: -414.1882, GP: 1.6645, Acc: 0.9572

Epoch 31
------------------------------------------------------------
Step    0/597 - Gen Loss: -475847.8750, Critic Loss: -45586.6797, W-Loss: -45627.9766, GP: 0.1219, Acc: 0.9656
Step  100/597 - Gen Loss: -494646.7188, Critic Loss: 3815.5813, W-Loss: 3755.9443, GP: 1.7410, Acc: 0.9563
Step  200/597 - Gen Loss: -478431.4375, Critic Loss: 2061.6167, W-Loss: 2005.7374, GP: 1.7543, Acc: 0.9580
Step  300/597 - Gen Loss: -477491.0312, Critic Loss: -2509.7810, W-Loss: -2563.5820, GP: 1.7261, Acc: 0.9584
Step  400/597 - Gen Loss: -489856.6875, Critic Loss: -596.2393, W-Loss: -650.3376, GP: 1.7105, Acc: 0.9582
Step  500/597 - Gen Loss: -506996.7188, Critic Loss: -1237.0322, W-Loss: -1291.4202, GP: 1.6795, Acc: 0.9577

Epoch 32
------------------------------------------------------------
Step    0/597 - Gen Loss: -622965.1875, Critic Loss: 18184.7363, W-Loss: 18109.7617, GP: 3.6973, Acc: 0.9656
Step  100/597 - Gen Loss: -561139.8125, Critic Loss: -17.4066, W-Loss: -73.8957, GP: 1.8082, Acc: 0.9574
Step  200/597 - Gen Loss: -558194.1875, Critic Loss: 2090.9856, W-Loss: 2035.7196, GP: 1.7581, Acc: 0.9574
Step  300/597 - Gen Loss: -540324.1250, Critic Loss: -1950.4199, W-Loss: -2005.3887, GP: 1.8076, Acc: 0.9580
Step  400/597 - Gen Loss: -545837.0625, Critic Loss: 463.4841, W-Loss: 408.5214, GP: 1.7960, Acc: 0.9583
Step  500/597 - Gen Loss: -550958.0625, Critic Loss: -624.1193, W-Loss: -677.8334, GP: 1.7682, Acc: 0.9592

Epoch 33
------------------------------------------------------------
Step    0/597 - Gen Loss: -635976.9375, Critic Loss: 50721.0039, W-Loss: 50663.2109, GP: 0.4600, Acc: 0.9406
Step  100/597 - Gen Loss: -584246.3750, Critic Loss: -4643.0977, W-Loss: -4718.1997, GP: 2.4196, Acc: 0.9523
Step  200/597 - Gen Loss: -554619.1250, Critic Loss: 1099.8219, W-Loss: 1028.8320, GP: 2.4397, Acc: 0.9549
Step  300/597 - Gen Loss: -538190.6875, Critic Loss: -2895.8811, W-Loss: -2965.1895, GP: 2.4788, Acc: 0.9569
Step  400/597 - Gen Loss: -562300.4375, Critic Loss: -1652.8669, W-Loss: -1724.0557, GP: 2.5195, Acc: 0.9563
Step  500/597 - Gen Loss: -580082.8125, Critic Loss: -3084.8843, W-Loss: -3155.9131, GP: 2.5115, Acc: 0.9563

Epoch 34
------------------------------------------------------------
Step    0/597 - Gen Loss: -560130.3750, Critic Loss: -76268.0469, W-Loss: -76338.0781, GP: 1.0219, Acc: 0.9469
Step  100/597 - Gen Loss: -756952.8750, Critic Loss: 1824.7418, W-Loss: 1720.5886, GP: 3.2420, Acc: 0.9408
Step  200/597 - Gen Loss: -767875.7500, Critic Loss: 1789.6130, W-Loss: 1693.2338, GP: 3.0387, Acc: 0.9441
Step  300/597 - Gen Loss: -752344.7500, Critic Loss: -2449.9126, W-Loss: -2542.4805, GP: 3.0356, Acc: 0.9465
Step  400/597 - Gen Loss: -765746.1250, Critic Loss: -2101.0754, W-Loss: -2193.3689, GP: 3.0170, Acc: 0.9465
Step  500/597 - Gen Loss: -777772.1875, Critic Loss: -855.2665, W-Loss: -947.0332, GP: 3.0191, Acc: 0.9466

Epoch 35
------------------------------------------------------------
Step    0/597 - Gen Loss: -773469.6875, Critic Loss: 10933.4785, W-Loss: 10708.1621, GP: 16.2524, Acc: 0.9531
Step  100/597 - Gen Loss: -767088.6250, Critic Loss: 4319.9985, W-Loss: 4217.9517, GP: 3.5273, Acc: 0.9469
Step  200/597 - Gen Loss: -787969.5000, Critic Loss: 7633.7275, W-Loss: 7536.8560, GP: 3.3202, Acc: 0.9476
Step  300/597 - Gen Loss: -778988.3125, Critic Loss: -853.4112, W-Loss: -946.2977, GP: 3.2504, Acc: 0.9490
Step  400/597 - Gen Loss: -802210.6875, Critic Loss: 335.1146, W-Loss: 241.6962, GP: 3.1477, Acc: 0.9480
Step  500/597 - Gen Loss: -812596.7500, Critic Loss: -662.6339, W-Loss: -754.9269, GP: 3.0696, Acc: 0.9489

Epoch 36
------------------------------------------------------------
Step    0/597 - Gen Loss: -793764.2500, Critic Loss: 50824.8711, W-Loss: 50617.6875, GP: 11.9495, Acc: 0.9187
Step  100/597 - Gen Loss: -870761.5000, Critic Loss: -8679.8174, W-Loss: -8801.0322, GP: 3.8415, Acc: 0.9390
Step  200/597 - Gen Loss: -866224.3125, Critic Loss: 4448.7109, W-Loss: 4340.9746, GP: 3.4119, Acc: 0.9427
Step  300/597 - Gen Loss: -845012.1875, Critic Loss: 497.8233, W-Loss: 396.7625, GP: 3.4135, Acc: 0.9458
Step  400/597 - Gen Loss: -846175.8125, Critic Loss: 80.4248, W-Loss: -20.8514, GP: 3.4223, Acc: 0.9462
Step  500/597 - Gen Loss: -850639.1875, Critic Loss: -2978.5813, W-Loss: -3080.7312, GP: 3.4415, Acc: 0.9461

Epoch 37
------------------------------------------------------------
Step    0/597 - Gen Loss: -784277.8750, Critic Loss: -158035.8750, W-Loss: -158213.7500, GP: 11.6397, Acc: 0.9656
Step  100/597 - Gen Loss: -1005912.3125, Critic Loss: -7435.5464, W-Loss: -7575.7944, GP: 4.2841, Acc: 0.9378
Step  200/597 - Gen Loss: -1055176.2500, Critic Loss: -3880.8660, W-Loss: -4019.9836, GP: 4.2886, Acc: 0.9373
Step  300/597 - Gen Loss: -1036261.1250, Critic Loss: -6892.7012, W-Loss: -7023.3813, GP: 4.0284, Acc: 0.9388
Step  400/597 - Gen Loss: -1031650.8750, Critic Loss: -3419.1316, W-Loss: -3546.6804, GP: 3.9875, Acc: 0.9402
Step  500/597 - Gen Loss: -1026001.3750, Critic Loss: -6373.2573, W-Loss: -6500.6934, GP: 4.0342, Acc: 0.9402

Epoch 38
------------------------------------------------------------
Step    0/597 - Gen Loss: -756945.1250, Critic Loss: -25983.6152, W-Loss: -26051.7500, GP: 0.2133, Acc: 0.9531
Step  100/597 - Gen Loss: -1122305.2500, Critic Loss: -12005.1562, W-Loss: -12157.9238, GP: 4.4370, Acc: 0.9319
Step  200/597 - Gen Loss: -1143930.6250, Critic Loss: -1891.4482, W-Loss: -2046.4236, GP: 4.7909, Acc: 0.9324
Step  300/597 - Gen Loss: -1098878.8750, Critic Loss: -7292.5376, W-Loss: -7442.9707, GP: 4.8969, Acc: 0.9352
Step  400/597 - Gen Loss: -1081516.8750, Critic Loss: -5152.9414, W-Loss: -5299.5029, GP: 4.7373, Acc: 0.9358
Step  500/597 - Gen Loss: -1071030.0000, Critic Loss: -6369.1675, W-Loss: -6514.7212, GP: 4.7642, Acc: 0.9361

Epoch 39
------------------------------------------------------------
Step    0/597 - Gen Loss: -1023746.9375, Critic Loss: 18848.8633, W-Loss: 18759.5996, GP: 0.1926, Acc: 0.9438
Step  100/597 - Gen Loss: -1236544.3750, Critic Loss: -4313.1479, W-Loss: -4499.8213, GP: 5.7056, Acc: 0.9273
Step  200/597 - Gen Loss: -1269786.1250, Critic Loss: 6900.1528, W-Loss: 6715.3306, GP: 5.7684, Acc: 0.9269
Step  300/597 - Gen Loss: -1257984.0000, Critic Loss: 283.9683, W-Loss: 113.2171, GP: 5.3049, Acc: 0.9298
Step  400/597 - Gen Loss: -1244893.6250, Critic Loss: 2756.3755, W-Loss: 2590.2239, GP: 5.2215, Acc: 0.9308
Step  500/597 - Gen Loss: -1244310.0000, Critic Loss: 1068.3558, W-Loss: 906.8563, GP: 5.1398, Acc: 0.9320

Epoch 40
------------------------------------------------------------
Step    0/597 - Gen Loss: -1424203.5000, Critic Loss: 54021.2383, W-Loss: 53903.4258, GP: 0.2505, Acc: 0.9438
Step  100/597 - Gen Loss: -1316783.6250, Critic Loss: -16123.7451, W-Loss: -16317.8652, GP: 5.5293, Acc: 0.9219
Step  200/597 - Gen Loss: -1357374.1250, Critic Loss: -966.2819, W-Loss: -1154.9509, GP: 5.3637, Acc: 0.9229
Step  300/597 - Gen Loss: -1334001.6250, Critic Loss: -4009.3813, W-Loss: -4187.5981, GP: 5.3274, Acc: 0.9262
Step  400/597 - Gen Loss: -1340949.2500, Critic Loss: 3486.8301, W-Loss: 3310.6672, GP: 5.4323, Acc: 0.9272
Step  500/597 - Gen Loss: -1346868.5000, Critic Loss: 1889.8473, W-Loss: 1719.1547, GP: 5.2953, Acc: 0.9290

Epoch 41
------------------------------------------------------------
Step    0/597 - Gen Loss: -1491935.2500, Critic Loss: -140766.3906, W-Loss: -140916.1562, GP: 4.4768, Acc: 0.9312
Step  100/597 - Gen Loss: -1369227.6250, Critic Loss: -58.3916, W-Loss: -231.9572, GP: 5.3668, Acc: 0.9284
Step  200/597 - Gen Loss: -1392899.6250, Critic Loss: 11850.8555, W-Loss: 11682.5283, GP: 5.2678, Acc: 0.9306
Step  300/597 - Gen Loss: -1350118.7500, Critic Loss: 5693.1885, W-Loss: 5536.8086, GP: 5.0746, Acc: 0.9346
Step  400/597 - Gen Loss: -1336028.2500, Critic Loss: 9685.6074, W-Loss: 9535.9600, GP: 4.9243, Acc: 0.9366
Step  500/597 - Gen Loss: -1350258.8750, Critic Loss: 2157.9871, W-Loss: 2009.7860, GP: 4.8593, Acc: 0.9373

Epoch 42
------------------------------------------------------------
Step    0/597 - Gen Loss: -1613547.6250, Critic Loss: -401549.4375, W-Loss: -401643.4688, GP: 0.4807, Acc: 0.9312
Step  100/597 - Gen Loss: -1551730.5000, Critic Loss: -18699.4531, W-Loss: -18862.4434, GP: 5.1693, Acc: 0.9340
Step  200/597 - Gen Loss: -1521887.0000, Critic Loss: -10848.1758, W-Loss: -11019.0215, GP: 5.5882, Acc: 0.9332
Step  300/597 - Gen Loss: -1468184.3750, Critic Loss: -14513.4404, W-Loss: -14682.2588, GP: 5.5307, Acc: 0.9342
Step  400/597 - Gen Loss: -1507169.6250, Critic Loss: -10393.9980, W-Loss: -10565.1328, GP: 5.5522, Acc: 0.9332
Step  500/597 - Gen Loss: -1555776.0000, Critic Loss: -12895.3193, W-Loss: -13068.1494, GP: 5.6253, Acc: 0.9334

Epoch 43
------------------------------------------------------------
Step    0/597 - Gen Loss: -1522913.1250, Critic Loss: 120244.9141, W-Loss: 120081.5781, GP: 0.7189, Acc: 0.9438
Step  100/597 - Gen Loss: -1576830.6250, Critic Loss: 6977.1455, W-Loss: 6788.7520, GP: 6.4416, Acc: 0.9344
Step  200/597 - Gen Loss: -1551062.1250, Critic Loss: 1628.5188, W-Loss: 1437.9869, GP: 6.3961, Acc: 0.9332
Step  300/597 - Gen Loss: -1499043.2500, Critic Loss: -5612.9912, W-Loss: -5801.1699, GP: 6.3567, Acc: 0.9340
Step  400/597 - Gen Loss: -1510692.8750, Critic Loss: -3656.7922, W-Loss: -3847.2258, GP: 6.5280, Acc: 0.9339
Step  500/597 - Gen Loss: -1508548.8750, Critic Loss: -3535.3005, W-Loss: -3724.0024, GP: 6.3877, Acc: 0.9342

Epoch 44
------------------------------------------------------------
Step    0/597 - Gen Loss: -1733452.1250, Critic Loss: 195599.5469, W-Loss: 195358.2031, GP: 6.2223, Acc: 0.9187
Step  100/597 - Gen Loss: -1717611.7500, Critic Loss: -2299.0327, W-Loss: -2525.0923, GP: 6.6679, Acc: 0.9200
Step  200/597 - Gen Loss: -1736528.3750, Critic Loss: -1677.5138, W-Loss: -1914.8732, GP: 6.8867, Acc: 0.9194
Step  300/597 - Gen Loss: -1705106.5000, Critic Loss: -9636.4160, W-Loss: -9873.3213, GP: 6.8990, Acc: 0.9197
Step  400/597 - Gen Loss: -1728699.7500, Critic Loss: -2921.2048, W-Loss: -3155.5579, GP: 6.7917, Acc: 0.9199
Step  500/597 - Gen Loss: -1756903.2500, Critic Loss: -13200.2256, W-Loss: -13436.1934, GP: 6.8901, Acc: 0.9199

Epoch 45
------------------------------------------------------------
Step    0/597 - Gen Loss: -1581362.3750, Critic Loss: 92452.8906, W-Loss: 91851.9531, GP: 45.3985, Acc: 0.9312
Step  100/597 - Gen Loss: -1912846.2500, Critic Loss: 10217.4336, W-Loss: 9951.7832, GP: 8.6135, Acc: 0.9147
Step  200/597 - Gen Loss: -1983845.3750, Critic Loss: 1820.5465, W-Loss: 1565.4093, GP: 8.2100, Acc: 0.9175
Step  300/597 - Gen Loss: -1980057.1250, Critic Loss: -15305.5225, W-Loss: -15554.3896, GP: 8.0499, Acc: 0.9188
Step  400/597 - Gen Loss: -2005652.0000, Critic Loss: -16618.5039, W-Loss: -16869.8555, GP: 8.5230, Acc: 0.9196
Step  500/597 - Gen Loss: -2035727.6250, Critic Loss: -11254.6582, W-Loss: -11505.6650, GP: 8.5909, Acc: 0.9204

Epoch 46
------------------------------------------------------------
Step    0/597 - Gen Loss: -1742237.5000, Critic Loss: 1940.2719, W-Loss: 1753.3500, GP: 1.3018, Acc: 0.9500
Step  100/597 - Gen Loss: -2253974.5000, Critic Loss: -30344.5742, W-Loss: -30651.2070, GP: 9.4199, Acc: 0.9089
Step  200/597 - Gen Loss: -2239575.7500, Critic Loss: 10394.9766, W-Loss: 10090.7090, GP: 9.6325, Acc: 0.9093
Step  300/597 - Gen Loss: -2225093.0000, Critic Loss: -726.5741, W-Loss: -1020.5170, GP: 9.3494, Acc: 0.9100
Step  400/597 - Gen Loss: -2232213.5000, Critic Loss: -413.4816, W-Loss: -703.2023, GP: 9.1406, Acc: 0.9105
Step  500/597 - Gen Loss: -2240073.0000, Critic Loss: -4090.3645, W-Loss: -4375.1240, GP: 8.9964, Acc: 0.9114

Epoch 47
------------------------------------------------------------
Step    0/597 - Gen Loss: -1563098.3750, Critic Loss: -637946.0625, W-Loss: -638142.9375, GP: 0.3544, Acc: 0.9000
Step  100/597 - Gen Loss: -2345726.2500, Critic Loss: 12597.7559, W-Loss: 12265.2236, GP: 11.6136, Acc: 0.9078
Step  200/597 - Gen Loss: -2373745.7500, Critic Loss: 10769.4795, W-Loss: 10445.4824, GP: 11.5985, Acc: 0.9091
Step  300/597 - Gen Loss: -2408913.5000, Critic Loss: -14180.1211, W-Loss: -14500.0693, GP: 11.2962, Acc: 0.9101
Step  400/597 - Gen Loss: -2451198.5000, Critic Loss: -8055.6309, W-Loss: -8374.2939, GP: 11.1088, Acc: 0.9095
Step  500/597 - Gen Loss: -2420810.0000, Critic Loss: -2852.0498, W-Loss: -3167.7544, GP: 11.0480, Acc: 0.9108

Epoch 48
------------------------------------------------------------
Step    0/597 - Gen Loss: -1763271.3750, Critic Loss: 292832.5312, W-Loss: 292663.3750, GP: 1.7185, Acc: 0.9406
Step  100/597 - Gen Loss: -2366226.5000, Critic Loss: 18342.8203, W-Loss: 18007.4180, GP: 12.4494, Acc: 0.9110
Step  200/597 - Gen Loss: -2317468.0000, Critic Loss: 27450.3574, W-Loss: 27132.7363, GP: 11.2680, Acc: 0.9122
Step  300/597 - Gen Loss: -2188874.0000, Critic Loss: -4005.7847, W-Loss: -4313.1396, GP: 10.9518, Acc: 0.9134
Step  400/597 - Gen Loss: -2176339.5000, Critic Loss: 3786.9407, W-Loss: 3481.6948, GP: 10.7995, Acc: 0.9142
Step  500/597 - Gen Loss: -2169994.0000, Critic Loss: 2514.9136, W-Loss: 2213.3904, GP: 10.7127, Acc: 0.9147

Epoch 49
------------------------------------------------------------
Step    0/597 - Gen Loss: -2874546.7500, Critic Loss: 255544.8281, W-Loss: 255317.4531, GP: 0.1973, Acc: 0.9062
Step  100/597 - Gen Loss: -2478010.7500, Critic Loss: -5508.9673, W-Loss: -5875.7310, GP: 12.5463, Acc: 0.9026
Step  200/597 - Gen Loss: -2461892.5000, Critic Loss: 10507.3975, W-Loss: 10155.0928, GP: 12.1797, Acc: 0.9056
Step  300/597 - Gen Loss: -2375004.0000, Critic Loss: -313.9174, W-Loss: -654.7145, GP: 11.6865, Acc: 0.9064
Step  400/597 - Gen Loss: -2353695.0000, Critic Loss: 6889.4692, W-Loss: 6557.1958, GP: 11.3695, Acc: 0.9072
Step  500/597 - Gen Loss: -2327234.7500, Critic Loss: -352.0511, W-Loss: -678.1369, GP: 11.1023, Acc: 0.9080

Epoch 50
------------------------------------------------------------
Step    0/597 - Gen Loss: -593770.4375, Critic Loss: 137621.7812, W-Loss: 137412.7031, GP: 0.1847, Acc: 0.9094
Step  100/597 - Gen Loss: -2324717.5000, Critic Loss: -42510.2422, W-Loss: -42860.3906, GP: 12.4381, Acc: 0.9065
Step  200/597 - Gen Loss: -2339635.2500, Critic Loss: -35561.8438, W-Loss: -35917.3281, GP: 12.3743, Acc: 0.9051
Step  300/597 - Gen Loss: -2355899.7500, Critic Loss: -41214.9219, W-Loss: -41567.6758, GP: 12.1067, Acc: 0.9046
Step  400/597 - Gen Loss: -2474133.5000, Critic Loss: -19677.9336, W-Loss: -20038.1973, GP: 12.5141, Acc: 0.9037
Step  500/597 - Gen Loss: -2535366.2500, Critic Loss: -16984.1113, W-Loss: -17343.3438, GP: 12.7008, Acc: 0.9049

Generating Grid of Samples at Final Epoch 50:
Generating 6 samples per class for all 16 letter classes...
Final Generated Samples - All Letter Classes:
Generating 6 samples per class for all 16 letter classes...

Observation:¶

  • Generated Samples: letters in every class are completely unrecognizable, with severe distortions and inconsistent stroke patterns.
  • Training Losses: generator and critic losses fluctuate heavily and reach extreme magnitudes, indicating unstable training dynamics.
  • Discriminator Accuracy: starts stable but declines after roughly epoch 30, indicating a weakening critic.
  • Loss Balance: the extremely large and still-growing gap between generator and critic losses points to a severe imbalance between the two networks.
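The kind of divergence visible in the logs above (generator loss magnitude growing from roughly 5e5 to 2.5e6 over twenty epochs) can also be flagged programmatically rather than by eye. A minimal sketch, assuming per-epoch loss values are collected in a list; `detect_divergence` is a hypothetical helper, not part of this notebook:

```python
def detect_divergence(losses, window=5, factor=10.0):
    """Flag divergence when the recent mean loss magnitude exceeds the
    early mean magnitude by more than `factor`."""
    if len(losses) < 2 * window:
        return False
    early = sum(abs(x) for x in losses[:window]) / window
    late = sum(abs(x) for x in losses[-window:]) / window
    return late > factor * max(early, 1e-8)

# A flat loss series is not flagged; an exponentially growing one is.
stable = [1.0, 0.9, 1.1, 1.0, 0.95] * 4
diverging = [1.0] * 5 + [10.0 ** (i / 3) for i in range(15)]
print(detect_divergence(stable))     # False
print(detect_divergence(diverging))  # True
```

A check like this could be called at the end of each epoch on `history['gen_loss']` to stop a run early instead of spending 50 epochs on an unstable configuration.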

Basic CGAN (Conditional GAN) Implementation¶

This section implements a Conditional GAN based on the label-conditioning approach of the original CGAN paper, extended with an auxiliary classification head (in the style of AC-GAN) to provide controlled generation based on class labels.

CGAN Architecture:¶

  • Conditional Generator: label conditioning through a learned embedding concatenated with the noise vector
  • Conditional Discriminator: joint real/fake discrimination and class prediction via an auxiliary softmax head
  • Label conditioning: direct label embedding and concatenation at the input of both networks
  • Training strategy: standard adversarial (binary cross-entropy) training combined with an auxiliary classification loss
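The conditioning mechanism described above can be sketched in a few lines of NumPy. This is a toy illustration with a random, untrained embedding table; in the generator below, the equivalent table is learned end-to-end by a Keras `Embedding` layer:

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim, num_classes = 100, 16

# Stand-in for the learned embedding table: one latent_dim-sized vector per class.
embedding = rng.normal(size=(num_classes, latent_dim))

noise = rng.normal(size=(latent_dim,))
label = 3  # class index to condition on

# CGAN-style conditioning: look up the label's embedding and concatenate
# it with the noise vector before the first dense layer.
conditioned = np.concatenate([noise, embedding[label]])
print(conditioned.shape)  # (200,)
```

Because the embedding is concatenated rather than added, the downstream dense layer sees noise and label information in separate coordinates and can weight them independently.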
In [19]:
# CGAN - COMPLETE IMPLEMENTATION

def build_cgan_generator(latent_dim=100, num_classes=16, img_height=28, img_width=28):
    """
    Build CGAN generator following original CGAN paper
    Conditional generation through label embedding and concatenation
    """
    print("Building CGAN Generator...")
    
    # Noise input
    noise_input = tf.keras.layers.Input(shape=(latent_dim,), name='noise_input')
    
    # Label input
    label_input = tf.keras.layers.Input(shape=(), dtype='int32', name='label_input')
    
    # Process label - embedding and expand
    label_embedding = tf.keras.layers.Embedding(num_classes, latent_dim)(label_input)
    label_embedding = tf.keras.layers.Flatten()(label_embedding)
    
    # Concatenate noise and label embedding
    combined_input = tf.keras.layers.Concatenate()([noise_input, label_embedding])
    
    # Dense layer to create feature map
    x = tf.keras.layers.Dense(7 * 7 * 256, use_bias=False)(combined_input)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.ReLU()(x)
    x = tf.keras.layers.Reshape((7, 7, 256))(x)
    
    # First upsampling: 7x7x256 -> 14x14x128
    x = tf.keras.layers.Conv2DTranspose(128, 5, strides=2, padding='same', use_bias=False)(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.ReLU()(x)
    
    # Second upsampling: 14x14x128 -> 28x28x64
    x = tf.keras.layers.Conv2DTranspose(64, 5, strides=2, padding='same', use_bias=False)(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.ReLU()(x)
    
    # Final layer: 28x28x64 -> 28x28x1
    x = tf.keras.layers.Conv2DTranspose(1, 5, strides=1, padding='same', activation='tanh')(x)
    
    model = tf.keras.Model(
        inputs=[noise_input, label_input],
        outputs=x,
        name='cgan_generator'
    )
    
    return model

def build_cgan_discriminator(img_height=28, img_width=28, num_classes=16):
    """
    Build CGAN discriminator following original CGAN paper
    Joint discrimination and classification
    """
    print("Building CGAN Discriminator...")
    
    # Image input
    img_input = tf.keras.layers.Input(shape=(img_height, img_width, 1), name='img_input')
    
    # Label input for conditioning
    label_input = tf.keras.layers.Input(shape=(), dtype='int32', name='label_input')
    
    # Process image
    x = img_input
    
    # First conv block: 28x28x1 -> 14x14x64
    x = tf.keras.layers.Conv2D(64, 5, strides=2, padding='same')(x)
    x = tf.keras.layers.LeakyReLU(alpha=0.2)(x)
    x = tf.keras.layers.Dropout(0.3)(x)
    
    # Second conv block: 14x14x64 -> 7x7x128
    x = tf.keras.layers.Conv2D(128, 5, strides=2, padding='same')(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.LeakyReLU(alpha=0.2)(x)
    x = tf.keras.layers.Dropout(0.3)(x)
    
    # Third conv block: 7x7x128 -> 4x4x256
    x = tf.keras.layers.Conv2D(256, 5, strides=2, padding='same')(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.LeakyReLU(alpha=0.2)(x)
    x = tf.keras.layers.Dropout(0.3)(x)
    
    # Flatten for dense layers
    x = tf.keras.layers.Flatten()(x)
    
    # Process label
    label_embedding = tf.keras.layers.Embedding(num_classes, 50)(label_input)
    label_embedding = tf.keras.layers.Flatten()(label_embedding)
    
    # Concatenate image features and label
    x = tf.keras.layers.Concatenate()([x, label_embedding])
    
    # Dense layers
    x = tf.keras.layers.Dense(1024)(x)
    x = tf.keras.layers.LeakyReLU(alpha=0.2)(x)
    x = tf.keras.layers.Dropout(0.5)(x)
    
    # Output layer (real/fake classification)
    validity = tf.keras.layers.Dense(1, activation='sigmoid', name='validity')(x)
    
    # Auxiliary classifier for label prediction
    label_pred = tf.keras.layers.Dense(num_classes, activation='softmax', name='label_pred')(x)
    
    model = tf.keras.Model(
        inputs=[img_input, label_input],
        outputs=[validity, label_pred],
        name='cgan_discriminator'
    )
    
    return model

# Build CGAN models
cgan_generator = build_cgan_generator(
    latent_dim=100, 
    num_classes=num_classes, 
    img_height=28, 
    img_width=28
)

cgan_discriminator = build_cgan_discriminator(
    img_height=28, 
    img_width=28, 
    num_classes=num_classes
)
Building CGAN Generator...
Building CGAN Discriminator...
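As a quick sanity check on the architecture comments in the generator code, the upsampling path can be verified with plain spatial-size arithmetic, no TensorFlow required. This relies on the standard Keras rule that `Conv2DTranspose` with `padding='same'` produces an output size of input size × stride:

```python
def conv_transpose_same_out(size, stride):
    # Keras Conv2DTranspose with padding='same': output = input * stride
    return size * stride

size = 7  # the Dense output is reshaped to 7x7x256
for stride in (2, 2, 1):  # strides of the three Conv2DTranspose layers
    size = conv_transpose_same_out(size, stride)
print(size)  # 28
```

The 7 → 14 → 28 → 28 progression confirms the generator ends at the 28×28 EMNIST resolution, matching the discriminator's input shape.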
In [20]:
# =============================================================================
# CGAN TRAINER CLASS
# =============================================================================

class CGANTrainer:
    """CGAN trainer for standard GAN loss with auxiliary classifier"""

    def __init__(self, generator, discriminator, latent_dim=100, num_classes=16):
        self.generator = generator
        self.discriminator = discriminator
        self.latent_dim = latent_dim
        self.num_classes = num_classes

        # Optimizers
        self.gen_optimizer = tf.keras.optimizers.Adam(1e-4)
        self.disc_optimizer = tf.keras.optimizers.Adam(1e-4)

        # Loss functions
        self.bce_loss = tf.keras.losses.BinaryCrossentropy(from_logits=False)
        self.ce_loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)

        # Metrics
        self.gen_loss_metric = tf.keras.metrics.Mean(name='gen_loss')
        self.disc_loss_metric = tf.keras.metrics.Mean(name='disc_loss')
        self.acc_metric = tf.keras.metrics.SparseCategoricalAccuracy(name='label_accuracy')

        # History
        self.history = {
            'gen_loss': [],
            'disc_loss': [],
            'label_accuracy': [],
            'epoch': []
        }

    @tf.function
    def train_step(self, real_images, real_labels):
        batch_size = tf.shape(real_images)[0]
        noise = tf.random.normal([batch_size, self.latent_dim])
        fake_labels = tf.random.uniform([batch_size], minval=0, maxval=self.num_classes, dtype=tf.int32)

        # ---------------------
        # Train Discriminator
        # ---------------------
        with tf.GradientTape() as tape:
            # Generate fake images
            fake_images = self.generator([noise, fake_labels], training=True)

            # Discriminator predictions
            real_validity, real_class_logits = self.discriminator([real_images, real_labels], training=True)
            fake_validity, fake_class_logits = self.discriminator([fake_images, fake_labels], training=True)

            # Validity losses
            real_loss = self.bce_loss(tf.ones_like(real_validity), real_validity)
            fake_loss = self.bce_loss(tf.zeros_like(fake_validity), fake_validity)
            validity_loss = real_loss + fake_loss

            # Auxiliary classification loss
            real_class_loss = self.ce_loss(real_labels, real_class_logits)
            fake_class_loss = self.ce_loss(fake_labels, fake_class_logits)
            class_loss = (real_class_loss + fake_class_loss) / 2.0

            # Total discriminator loss
            disc_loss = validity_loss + class_loss

        gradients = tape.gradient(disc_loss, self.discriminator.trainable_variables)
        self.disc_optimizer.apply_gradients(zip(gradients, self.discriminator.trainable_variables))

        # Update accuracy
        self.acc_metric.update_state(real_labels, real_class_logits)

        # ---------------------
        # Train Generator
        # ---------------------
        noise = tf.random.normal([batch_size, self.latent_dim])
        gen_labels = tf.random.uniform([batch_size], minval=0, maxval=self.num_classes, dtype=tf.int32)

        with tf.GradientTape() as tape:
            generated_images = self.generator([noise, gen_labels], training=True)
            validity, class_logits = self.discriminator([generated_images, gen_labels], training=True)

            adv_loss = self.bce_loss(tf.ones_like(validity), validity)
            aux_loss = self.ce_loss(gen_labels, class_logits)
            gen_loss = adv_loss + aux_loss

        gradients = tape.gradient(gen_loss, self.generator.trainable_variables)
        self.gen_optimizer.apply_gradients(zip(gradients, self.generator.trainable_variables))

        # Update metrics
        self.gen_loss_metric.update_state(gen_loss)
        self.disc_loss_metric.update_state(disc_loss)

    def train_epoch(self, dataset, epoch, steps_per_epoch):
        print(f"\nEpoch {epoch + 1}")
        print("-" * 60)

        self.gen_loss_metric.reset_states()
        self.disc_loss_metric.reset_states()
        self.acc_metric.reset_states()

        for step, (images, labels) in enumerate(dataset.take(steps_per_epoch)):
            self.train_step(images, labels)

            if step % 100 == 0:
                print(f"Step {step:4d}/{steps_per_epoch} - "
                      f"Gen Loss: {self.gen_loss_metric.result():.4f}, "
                      f"Disc Loss: {self.disc_loss_metric.result():.4f}, "
                      f"Acc: {self.acc_metric.result():.4f}")

        # Save epoch results
        self.history['gen_loss'].append(float(self.gen_loss_metric.result()))
        self.history['disc_loss'].append(float(self.disc_loss_metric.result()))
        self.history['label_accuracy'].append(float(self.acc_metric.result()))
        self.history['epoch'].append(epoch)

# Instantiate CGAN trainer before training loop
cgan_trainer = CGANTrainer(
    generator=cgan_generator,
    discriminator=cgan_discriminator,
    latent_dim=100,
    num_classes=num_classes
)
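As a reference point for the logs that follow: with the binary cross-entropy losses used in `train_step`, a discriminator that outputs 0.5 for every image (i.e., one that cannot distinguish real from fake at all) incurs a validity loss of exactly 2 ln 2 ≈ 1.386. A worked check in plain Python, where `bce` is a hypothetical scalar stand-in for `tf.keras.losses.BinaryCrossentropy`:

```python
import math

def bce(target, pred, eps=1e-7):
    """Binary cross-entropy for a single predicted probability."""
    pred = min(max(pred, eps), 1 - eps)
    return -(target * math.log(pred) + (1 - target) * math.log(1 - pred))

# A maximally confused discriminator predicts 0.5 for real and fake alike.
real_loss = bce(1.0, 0.5)   # -log(0.5)
fake_loss = bce(0.0, 0.5)   # -log(1 - 0.5)
disc_validity_loss = real_loss + fake_loss
print(round(disc_validity_loss, 4))  # 1.3863, i.e. 2*ln(2)
```

The logged Disc Loss settling around 0.8–0.95 sits below this equilibrium value even though it also includes the auxiliary classification term, which suggests the discriminator retains a consistent edge over the generator throughout training.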
In [32]:
# =============================================================================
# CGAN TRAINING - COMPLETE 50 EPOCH TRAINING
# =============================================================================

# Test visualization first
print("Testing visualization with current CGAN models...")
test_images_cgan, test_labels_cgan = display_generated_samples_grid(
    cgan_generator, class_to_letter, samples_per_class=6
)

# Training configuration
NUM_EPOCHS = 50

start_time = time.time()

for epoch in range(NUM_EPOCHS):
    epoch_start = time.time()
    
    # Train for one epoch
    cgan_trainer.train_epoch(train_dataset, epoch, steps_per_epoch)
    
    # Display only on the final epoch
    if (epoch + 1) == NUM_EPOCHS:
        print(f"\nGenerating Grid of Samples at Final Epoch {epoch + 1}:")
        display_generated_samples_grid(cgan_generator, class_to_letter, epoch + 1, samples_per_class=6)
    
    # Calculate and display epoch timing
    epoch_time = time.time() - epoch_start
    total_time = time.time() - start_time
    avg_time = total_time / (epoch + 1)
    eta = avg_time * (NUM_EPOCHS - epoch - 1)
    print(f"Epoch {epoch + 1} took {epoch_time:.1f}s (ETA: {eta / 60:.1f} min)")
    
total_training_time = time.time() - start_time

# Generate final display
print(f"\nFinal Generated Samples - All Letter Classes:")
final_images_cgan, final_labels_cgan = display_generated_samples_grid(
    cgan_generator, class_to_letter, NUM_EPOCHS, samples_per_class=6
)

# Plot training progress
if len(cgan_trainer.history['gen_loss']) > 1:
    plt.figure(figsize=(15, 5))
    
    # Generator and Discriminator Loss
    plt.subplot(1, 3, 1)
    epochs = cgan_trainer.history['epoch']
    plt.plot(epochs, cgan_trainer.history['gen_loss'], label='Generator Loss', color='blue', linewidth=2)
    plt.plot(epochs, cgan_trainer.history['disc_loss'], label='Discriminator Loss', color='red', linewidth=2)
    plt.title('CGAN - Training Losses', fontsize=14, fontweight='bold')
    plt.xlabel('Epoch')
    plt.ylabel('Loss')
    plt.legend()
    plt.grid(True, alpha=0.3)
    
    # Discriminator Accuracy
    plt.subplot(1, 3, 2)
    plt.plot(epochs, cgan_trainer.history['label_accuracy'], label='Discriminator Accuracy', color='green', linewidth=2)
    plt.axhline(y=0.95, color='red', linestyle='--', alpha=0.7, label='Upper limit (95%)')
    plt.axhline(y=0.70, color='orange', linestyle='--', alpha=0.7, label='Lower limit (70%)')
    plt.title('CGAN - Discriminator Accuracy', fontsize=14, fontweight='bold')
    plt.xlabel('Epoch')
    plt.ylabel('Accuracy')
    plt.legend()
    plt.grid(True, alpha=0.3)
    
    # Loss Difference
    plt.subplot(1, 3, 3)
    loss_diff = [g - d for g, d in zip(cgan_trainer.history['gen_loss'], cgan_trainer.history['disc_loss'])]
    plt.plot(epochs, loss_diff, label='Gen Loss - Disc Loss', color='purple', linewidth=2)
    plt.axhline(y=0, color='black', linestyle='-', alpha=0.5)
    plt.title('CGAN - Loss Balance', fontsize=14, fontweight='bold')
    plt.xlabel('Epoch')
    plt.ylabel('Loss Difference')
    plt.legend()
    plt.grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.show()


Testing visualization with current CGAN models...
Generating 6 samples per class for all 16 letter classes...
Epoch 1
------------------------------------------------------------
Step    0/597 - Gen Loss: 5.0598, Disc Loss: 5.6012, Acc: 0.0156
Step  100/597 - Gen Loss: 7.1038, Disc Loss: 3.5386, Acc: 0.3032
Step  200/597 - Gen Loss: 8.2649, Disc Loss: 2.9731, Acc: 0.4141
Step  300/597 - Gen Loss: 8.9221, Disc Loss: 2.7782, Acc: 0.4797
Step  400/597 - Gen Loss: 8.5059, Disc Loss: 2.5515, Acc: 0.5247
Step  500/597 - Gen Loss: 8.0641, Disc Loss: 2.3010, Acc: 0.5629

Epoch 2
------------------------------------------------------------
Step    0/597 - Gen Loss: 5.8132, Disc Loss: 0.8863, Acc: 0.7344
Step  100/597 - Gen Loss: 4.9323, Disc Loss: 0.8838, Acc: 0.7890
Step  200/597 - Gen Loss: 4.8511, Disc Loss: 0.9624, Acc: 0.8039
Step  300/597 - Gen Loss: 4.5862, Disc Loss: 0.9849, Acc: 0.8202
Step  400/597 - Gen Loss: 4.3214, Disc Loss: 0.9717, Acc: 0.8381
Step  500/597 - Gen Loss: 4.1598, Disc Loss: 0.9568, Acc: 0.8533

Epoch 3
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.9853, Disc Loss: 0.7731, Acc: 0.9688
Step  100/597 - Gen Loss: 2.8038, Disc Loss: 0.8524, Acc: 0.9503
Step  200/597 - Gen Loss: 2.6810, Disc Loss: 0.8490, Acc: 0.9580
Step  300/597 - Gen Loss: 2.6080, Disc Loss: 0.8283, Acc: 0.9621
Step  400/597 - Gen Loss: 2.5664, Disc Loss: 0.8224, Acc: 0.9655
Step  500/597 - Gen Loss: 2.5416, Disc Loss: 0.8191, Acc: 0.9681

Epoch 4
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.8358, Disc Loss: 0.7495, Acc: 1.0000
Step  100/597 - Gen Loss: 2.3346, Disc Loss: 0.8547, Acc: 0.9851
Step  200/597 - Gen Loss: 2.2474, Disc Loss: 0.8626, Acc: 0.9867
Step  300/597 - Gen Loss: 2.2162, Disc Loss: 0.8653, Acc: 0.9877
Step  400/597 - Gen Loss: 2.1724, Disc Loss: 0.8674, Acc: 0.9887
Step  500/597 - Gen Loss: 2.1381, Disc Loss: 0.8776, Acc: 0.9893

Epoch 5
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.6909, Disc Loss: 0.8477, Acc: 0.9844
Step  100/597 - Gen Loss: 1.9312, Disc Loss: 0.9085, Acc: 0.9955
Step  200/597 - Gen Loss: 1.9520, Disc Loss: 0.8846, Acc: 0.9955
Step  300/597 - Gen Loss: 1.9724, Disc Loss: 0.8648, Acc: 0.9958
Step  400/597 - Gen Loss: 2.0173, Disc Loss: 0.8302, Acc: 0.9958
Step  500/597 - Gen Loss: 2.0746, Disc Loss: 0.8100, Acc: 0.9962

Epoch 6
------------------------------------------------------------
Step    0/597 - Gen Loss: 2.3898, Disc Loss: 0.5754, Acc: 1.0000
Step  100/597 - Gen Loss: 2.3625, Disc Loss: 0.7735, Acc: 0.9983
Step  200/597 - Gen Loss: 2.4155, Disc Loss: 0.7672, Acc: 0.9978
Step  300/597 - Gen Loss: 2.4575, Disc Loss: 0.7491, Acc: 0.9983
Step  400/597 - Gen Loss: 2.4752, Disc Loss: 0.7528, Acc: 0.9981
Step  500/597 - Gen Loss: 2.4644, Disc Loss: 0.7636, Acc: 0.9983

Epoch 7
------------------------------------------------------------
Step    0/597 - Gen Loss: 2.7608, Disc Loss: 0.7054, Acc: 1.0000
Step  100/597 - Gen Loss: 2.1904, Disc Loss: 0.8092, Acc: 0.9994
Step  200/597 - Gen Loss: 2.1649, Disc Loss: 0.7810, Acc: 0.9992
Step  300/597 - Gen Loss: 2.1906, Disc Loss: 0.7829, Acc: 0.9992
Step  400/597 - Gen Loss: 2.1817, Disc Loss: 0.7955, Acc: 0.9992
Step  500/597 - Gen Loss: 2.1735, Disc Loss: 0.8008, Acc: 0.9992

Epoch 8
------------------------------------------------------------
Step    0/597 - Gen Loss: 2.1570, Disc Loss: 1.2238, Acc: 1.0000
Step  100/597 - Gen Loss: 2.1324, Disc Loss: 0.8146, Acc: 0.9995
Step  200/597 - Gen Loss: 2.1263, Disc Loss: 0.8124, Acc: 0.9997
Step  300/597 - Gen Loss: 2.1203, Disc Loss: 0.8132, Acc: 0.9994
Step  400/597 - Gen Loss: 2.1138, Disc Loss: 0.8205, Acc: 0.9993
Step  500/597 - Gen Loss: 2.1246, Disc Loss: 0.8058, Acc: 0.9994

Epoch 9
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.9921, Disc Loss: 0.9112, Acc: 1.0000
Step  100/597 - Gen Loss: 2.1556, Disc Loss: 0.8405, Acc: 0.9995
Step  200/597 - Gen Loss: 2.1151, Disc Loss: 0.8227, Acc: 0.9995
Step  300/597 - Gen Loss: 2.0856, Disc Loss: 0.8267, Acc: 0.9996
Step  400/597 - Gen Loss: 2.0959, Disc Loss: 0.8326, Acc: 0.9996
Step  500/597 - Gen Loss: 2.0823, Disc Loss: 0.8328, Acc: 0.9996

Epoch 10
------------------------------------------------------------
Step    0/597 - Gen Loss: 2.6363, Disc Loss: 1.1065, Acc: 1.0000
Step  100/597 - Gen Loss: 2.0544, Disc Loss: 0.8455, Acc: 0.9998
Step  200/597 - Gen Loss: 2.0135, Disc Loss: 0.8574, Acc: 0.9998
Step  300/597 - Gen Loss: 2.0266, Disc Loss: 0.8414, Acc: 0.9998
Step  400/597 - Gen Loss: 2.0270, Disc Loss: 0.8512, Acc: 0.9998
Step  500/597 - Gen Loss: 2.0100, Disc Loss: 0.8498, Acc: 0.9998

Epoch 11
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.4414, Disc Loss: 0.9911, Acc: 1.0000
Step  100/597 - Gen Loss: 1.9453, Disc Loss: 0.8387, Acc: 0.9998
Step  200/597 - Gen Loss: 1.9198, Disc Loss: 0.8522, Acc: 0.9998
Step  300/597 - Gen Loss: 1.9053, Disc Loss: 0.8756, Acc: 0.9998
Step  400/597 - Gen Loss: 1.9128, Disc Loss: 0.8804, Acc: 0.9998
Step  500/597 - Gen Loss: 1.8853, Disc Loss: 0.8885, Acc: 0.9999

Epoch 12
------------------------------------------------------------
Step    0/597 - Gen Loss: 2.5934, Disc Loss: 0.6902, Acc: 1.0000
Step  100/597 - Gen Loss: 1.8064, Disc Loss: 0.9090, Acc: 0.9998
Step  200/597 - Gen Loss: 1.7954, Disc Loss: 0.9177, Acc: 0.9999
Step  300/597 - Gen Loss: 1.8078, Disc Loss: 0.9171, Acc: 0.9999
Step  400/597 - Gen Loss: 1.7997, Disc Loss: 0.9176, Acc: 0.9999
Step  500/597 - Gen Loss: 1.7845, Disc Loss: 0.9153, Acc: 0.9998

Epoch 13
------------------------------------------------------------
Step    0/597 - Gen Loss: 2.0378, Disc Loss: 0.7232, Acc: 1.0000
Step  100/597 - Gen Loss: 1.7785, Disc Loss: 0.9370, Acc: 0.9997
Step  200/597 - Gen Loss: 1.7601, Disc Loss: 0.9278, Acc: 0.9998
Step  300/597 - Gen Loss: 1.7421, Disc Loss: 0.9281, Acc: 0.9998
Step  400/597 - Gen Loss: 1.7518, Disc Loss: 0.9290, Acc: 0.9999
Step  500/597 - Gen Loss: 1.7517, Disc Loss: 0.9341, Acc: 0.9999

Epoch 14
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.1977, Disc Loss: 0.8241, Acc: 1.0000
Step  100/597 - Gen Loss: 1.6852, Disc Loss: 0.9497, Acc: 0.9998
Step  200/597 - Gen Loss: 1.6897, Disc Loss: 0.9255, Acc: 0.9999
Step  300/597 - Gen Loss: 1.7218, Disc Loss: 0.9219, Acc: 0.9999
Step  400/597 - Gen Loss: 1.7330, Disc Loss: 0.9273, Acc: 0.9998
Step  500/597 - Gen Loss: 1.7309, Disc Loss: 0.9336, Acc: 0.9998

Epoch 15
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.3696, Disc Loss: 0.7744, Acc: 1.0000
Step  100/597 - Gen Loss: 1.7507, Disc Loss: 0.9422, Acc: 0.9998
Step  200/597 - Gen Loss: 1.7196, Disc Loss: 0.9526, Acc: 0.9999
Step  300/597 - Gen Loss: 1.7010, Disc Loss: 0.9538, Acc: 0.9999
Step  400/597 - Gen Loss: 1.6872, Disc Loss: 0.9557, Acc: 0.9999
Step  500/597 - Gen Loss: 1.6914, Disc Loss: 0.9571, Acc: 0.9999

Epoch 16
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.6109, Disc Loss: 1.1583, Acc: 1.0000
Step  100/597 - Gen Loss: 1.7017, Disc Loss: 0.9604, Acc: 0.9998
Step  200/597 - Gen Loss: 1.7061, Disc Loss: 0.9564, Acc: 0.9998
Step  300/597 - Gen Loss: 1.6871, Disc Loss: 0.9593, Acc: 0.9998
Step  400/597 - Gen Loss: 1.6920, Disc Loss: 0.9499, Acc: 0.9999
Step  500/597 - Gen Loss: 1.6834, Disc Loss: 0.9550, Acc: 0.9999

Epoch 17
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.3723, Disc Loss: 0.7417, Acc: 1.0000
Step  100/597 - Gen Loss: 1.5921, Disc Loss: 0.9806, Acc: 1.0000
Step  200/597 - Gen Loss: 1.6177, Disc Loss: 0.9761, Acc: 1.0000
Step  300/597 - Gen Loss: 1.6230, Disc Loss: 0.9745, Acc: 1.0000
Step  400/597 - Gen Loss: 1.6374, Disc Loss: 0.9661, Acc: 0.9999
Step  500/597 - Gen Loss: 1.6426, Disc Loss: 0.9615, Acc: 0.9999

Epoch 18
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.8726, Disc Loss: 1.0343, Acc: 1.0000
Step  100/597 - Gen Loss: 1.5486, Disc Loss: 0.9698, Acc: 1.0000
Step  200/597 - Gen Loss: 1.5739, Disc Loss: 0.9807, Acc: 1.0000
Step  300/597 - Gen Loss: 1.5685, Disc Loss: 0.9723, Acc: 0.9999
Step  400/597 - Gen Loss: 1.5745, Disc Loss: 0.9718, Acc: 1.0000
Step  500/597 - Gen Loss: 1.5804, Disc Loss: 0.9656, Acc: 1.0000

Epoch 19
------------------------------------------------------------
Step    0/597 - Gen Loss: 2.1206, Disc Loss: 0.9742, Acc: 1.0000
Step  100/597 - Gen Loss: 1.6684, Disc Loss: 0.9933, Acc: 1.0000
Step  200/597 - Gen Loss: 1.6552, Disc Loss: 0.9777, Acc: 1.0000
Step  300/597 - Gen Loss: 1.6360, Disc Loss: 0.9693, Acc: 1.0000
Step  400/597 - Gen Loss: 1.6339, Disc Loss: 0.9661, Acc: 1.0000
Step  500/597 - Gen Loss: 1.6218, Disc Loss: 0.9659, Acc: 1.0000

Epoch 20
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.9822, Disc Loss: 0.9079, Acc: 1.0000
Step  100/597 - Gen Loss: 1.5733, Disc Loss: 1.0006, Acc: 1.0000
Step  200/597 - Gen Loss: 1.5603, Disc Loss: 0.9922, Acc: 1.0000
Step  300/597 - Gen Loss: 1.5838, Disc Loss: 0.9789, Acc: 1.0000
Step  400/597 - Gen Loss: 1.6020, Disc Loss: 0.9673, Acc: 1.0000
Step  500/597 - Gen Loss: 1.6050, Disc Loss: 0.9677, Acc: 1.0000

Epoch 21
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.6322, Disc Loss: 0.7747, Acc: 1.0000
Step  100/597 - Gen Loss: 1.6354, Disc Loss: 0.9954, Acc: 1.0000
Step  200/597 - Gen Loss: 1.6006, Disc Loss: 0.9921, Acc: 1.0000
Step  300/597 - Gen Loss: 1.5870, Disc Loss: 1.0033, Acc: 1.0000
Step  400/597 - Gen Loss: 1.5722, Disc Loss: 0.9983, Acc: 1.0000
Step  500/597 - Gen Loss: 1.5603, Disc Loss: 0.9937, Acc: 1.0000

Epoch 22
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.8528, Disc Loss: 0.9226, Acc: 1.0000
Step  100/597 - Gen Loss: 1.4967, Disc Loss: 1.0180, Acc: 1.0000
Step  200/597 - Gen Loss: 1.5102, Disc Loss: 1.0120, Acc: 1.0000
Step  300/597 - Gen Loss: 1.5346, Disc Loss: 0.9920, Acc: 0.9999
Step  400/597 - Gen Loss: 1.5412, Disc Loss: 0.9982, Acc: 1.0000
Step  500/597 - Gen Loss: 1.5354, Disc Loss: 0.9980, Acc: 1.0000

Epoch 23
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.4286, Disc Loss: 0.8563, Acc: 1.0000
Step  100/597 - Gen Loss: 1.5528, Disc Loss: 0.9853, Acc: 1.0000
Step  200/597 - Gen Loss: 1.5578, Disc Loss: 0.9776, Acc: 1.0000
Step  300/597 - Gen Loss: 1.5650, Disc Loss: 0.9724, Acc: 1.0000
Step  400/597 - Gen Loss: 1.5747, Disc Loss: 0.9810, Acc: 1.0000
Step  500/597 - Gen Loss: 1.5506, Disc Loss: 0.9923, Acc: 1.0000

Epoch 24
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.4679, Disc Loss: 0.9005, Acc: 1.0000
Step  100/597 - Gen Loss: 1.5223, Disc Loss: 0.9916, Acc: 1.0000
Step  200/597 - Gen Loss: 1.5264, Disc Loss: 1.0003, Acc: 0.9999
Step  300/597 - Gen Loss: 1.5114, Disc Loss: 0.9940, Acc: 0.9999
Step  400/597 - Gen Loss: 1.5274, Disc Loss: 0.9903, Acc: 1.0000
Step  500/597 - Gen Loss: 1.5351, Disc Loss: 0.9927, Acc: 1.0000

Epoch 25
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.2026, Disc Loss: 0.9910, Acc: 1.0000
Step  100/597 - Gen Loss: 1.5138, Disc Loss: 1.0031, Acc: 1.0000
Step  200/597 - Gen Loss: 1.5116, Disc Loss: 1.0039, Acc: 1.0000
Step  300/597 - Gen Loss: 1.5214, Disc Loss: 0.9927, Acc: 1.0000
Step  400/597 - Gen Loss: 1.5404, Disc Loss: 0.9849, Acc: 1.0000
Step  500/597 - Gen Loss: 1.5401, Disc Loss: 0.9852, Acc: 1.0000

Epoch 26
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0801, Disc Loss: 1.1843, Acc: 1.0000
Step  100/597 - Gen Loss: 1.4476, Disc Loss: 1.0159, Acc: 1.0000
Step  200/597 - Gen Loss: 1.4513, Disc Loss: 1.0099, Acc: 1.0000
Step  300/597 - Gen Loss: 1.4622, Disc Loss: 0.9985, Acc: 1.0000
Step  400/597 - Gen Loss: 1.4740, Disc Loss: 0.9995, Acc: 1.0000
Step  500/597 - Gen Loss: 1.4961, Disc Loss: 0.9953, Acc: 1.0000

Epoch 27
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.1296, Disc Loss: 0.6856, Acc: 1.0000
Step  100/597 - Gen Loss: 1.5359, Disc Loss: 0.9997, Acc: 1.0000
Step  200/597 - Gen Loss: 1.5140, Disc Loss: 0.9918, Acc: 1.0000
Step  300/597 - Gen Loss: 1.5326, Disc Loss: 0.9884, Acc: 1.0000
Step  400/597 - Gen Loss: 1.5417, Disc Loss: 0.9846, Acc: 1.0000
Step  500/597 - Gen Loss: 1.5387, Disc Loss: 0.9946, Acc: 1.0000

Epoch 28
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.3007, Disc Loss: 1.0598, Acc: 1.0000
Step  100/597 - Gen Loss: 1.4465, Disc Loss: 1.0062, Acc: 0.9998
Step  200/597 - Gen Loss: 1.4765, Disc Loss: 1.0059, Acc: 0.9999
Step  300/597 - Gen Loss: 1.4908, Disc Loss: 1.0081, Acc: 0.9999
Step  400/597 - Gen Loss: 1.4979, Disc Loss: 0.9945, Acc: 0.9999
Step  500/597 - Gen Loss: 1.5041, Disc Loss: 1.0024, Acc: 0.9999

Epoch 29
------------------------------------------------------------
Step    0/597 - Gen Loss: 0.9484, Disc Loss: 1.3155, Acc: 1.0000
Step  100/597 - Gen Loss: 1.5220, Disc Loss: 0.9905, Acc: 0.9998
Step  200/597 - Gen Loss: 1.5098, Disc Loss: 0.9795, Acc: 0.9999
Step  300/597 - Gen Loss: 1.5227, Disc Loss: 0.9765, Acc: 0.9999
Step  400/597 - Gen Loss: 1.5352, Disc Loss: 0.9702, Acc: 0.9999
Step  500/597 - Gen Loss: 1.5328, Disc Loss: 0.9781, Acc: 0.9999

Epoch 30
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.5309, Disc Loss: 0.8102, Acc: 1.0000
Step  100/597 - Gen Loss: 1.6077, Disc Loss: 0.9597, Acc: 1.0000
Step  200/597 - Gen Loss: 1.5573, Disc Loss: 0.9951, Acc: 0.9999
Step  300/597 - Gen Loss: 1.5474, Disc Loss: 1.0031, Acc: 0.9999
Step  400/597 - Gen Loss: 1.5521, Disc Loss: 0.9992, Acc: 1.0000
Step  500/597 - Gen Loss: 1.5461, Disc Loss: 0.9996, Acc: 0.9999

Epoch 31
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0427, Disc Loss: 0.7163, Acc: 1.0000
Step  100/597 - Gen Loss: 1.5034, Disc Loss: 0.9870, Acc: 1.0000
Step  200/597 - Gen Loss: 1.5642, Disc Loss: 0.9876, Acc: 1.0000
Step  300/597 - Gen Loss: 1.5366, Disc Loss: 0.9899, Acc: 1.0000
Step  400/597 - Gen Loss: 1.5163, Disc Loss: 0.9922, Acc: 1.0000
Step  500/597 - Gen Loss: 1.5180, Disc Loss: 0.9899, Acc: 1.0000

Epoch 32
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.5243, Disc Loss: 1.0807, Acc: 1.0000
Step  100/597 - Gen Loss: 1.4988, Disc Loss: 1.0195, Acc: 1.0000
Step  200/597 - Gen Loss: 1.4719, Disc Loss: 1.0246, Acc: 1.0000
Step  300/597 - Gen Loss: 1.4759, Disc Loss: 1.0146, Acc: 1.0000
Step  400/597 - Gen Loss: 1.4857, Disc Loss: 1.0043, Acc: 1.0000
Step  500/597 - Gen Loss: 1.4937, Disc Loss: 0.9921, Acc: 1.0000

Epoch 33
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.1048, Disc Loss: 0.9096, Acc: 1.0000
Step  100/597 - Gen Loss: 1.4984, Disc Loss: 0.9752, Acc: 0.9998
Step  200/597 - Gen Loss: 1.5341, Disc Loss: 0.9853, Acc: 0.9999
Step  300/597 - Gen Loss: 1.5421, Disc Loss: 0.9743, Acc: 0.9999
Step  400/597 - Gen Loss: 1.5576, Disc Loss: 0.9694, Acc: 0.9999
Step  500/597 - Gen Loss: 1.5514, Disc Loss: 0.9768, Acc: 0.9999

Epoch 34
------------------------------------------------------------
Step    0/597 - Gen Loss: 2.3499, Disc Loss: 0.8546, Acc: 1.0000
Step  100/597 - Gen Loss: 1.5342, Disc Loss: 1.0657, Acc: 1.0000
Step  200/597 - Gen Loss: 1.5402, Disc Loss: 1.0137, Acc: 1.0000
Step  300/597 - Gen Loss: 1.5392, Disc Loss: 0.9921, Acc: 0.9999
Step  400/597 - Gen Loss: 1.5426, Disc Loss: 0.9714, Acc: 1.0000
Step  500/597 - Gen Loss: 1.5597, Disc Loss: 0.9764, Acc: 1.0000

Epoch 35
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.1955, Disc Loss: 1.1284, Acc: 1.0000
Step  100/597 - Gen Loss: 1.5625, Disc Loss: 1.0004, Acc: 1.0000
Step  200/597 - Gen Loss: 1.5398, Disc Loss: 0.9904, Acc: 1.0000
Step  300/597 - Gen Loss: 1.5538, Disc Loss: 0.9789, Acc: 1.0000
Step  400/597 - Gen Loss: 1.5649, Disc Loss: 0.9742, Acc: 1.0000
Step  500/597 - Gen Loss: 1.5761, Disc Loss: 0.9695, Acc: 1.0000

Epoch 36
------------------------------------------------------------
Step    0/597 - Gen Loss: 2.6861, Disc Loss: 1.0736, Acc: 1.0000
Step  100/597 - Gen Loss: 1.5386, Disc Loss: 1.0138, Acc: 1.0000
Step  200/597 - Gen Loss: 1.5441, Disc Loss: 0.9887, Acc: 1.0000
Step  300/597 - Gen Loss: 1.5504, Disc Loss: 0.9932, Acc: 1.0000
Step  400/597 - Gen Loss: 1.5577, Disc Loss: 0.9743, Acc: 1.0000
Step  500/597 - Gen Loss: 1.5815, Disc Loss: 0.9705, Acc: 1.0000

Epoch 37
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.3817, Disc Loss: 0.7744, Acc: 1.0000
Step  100/597 - Gen Loss: 1.5922, Disc Loss: 0.9536, Acc: 1.0000
Step  200/597 - Gen Loss: 1.5994, Disc Loss: 0.9640, Acc: 1.0000
Step  300/597 - Gen Loss: 1.5856, Disc Loss: 0.9547, Acc: 1.0000
Step  400/597 - Gen Loss: 1.5917, Disc Loss: 0.9442, Acc: 1.0000
Step  500/597 - Gen Loss: 1.6120, Disc Loss: 0.9535, Acc: 1.0000

Epoch 38
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.5896, Disc Loss: 0.9153, Acc: 1.0000
Step  100/597 - Gen Loss: 1.6128, Disc Loss: 0.9406, Acc: 1.0000
Step  200/597 - Gen Loss: 1.6142, Disc Loss: 0.9533, Acc: 1.0000
Step  300/597 - Gen Loss: 1.6488, Disc Loss: 0.9497, Acc: 1.0000
Step  400/597 - Gen Loss: 1.6636, Disc Loss: 0.9520, Acc: 1.0000
Step  500/597 - Gen Loss: 1.6630, Disc Loss: 0.9580, Acc: 1.0000

Epoch 39
------------------------------------------------------------
Step    0/597 - Gen Loss: 2.0271, Disc Loss: 0.8042, Acc: 1.0000
Step  100/597 - Gen Loss: 1.7471, Disc Loss: 0.8999, Acc: 1.0000
Step  200/597 - Gen Loss: 1.7032, Disc Loss: 0.9289, Acc: 1.0000
Step  300/597 - Gen Loss: 1.6981, Disc Loss: 0.9281, Acc: 1.0000
Step  400/597 - Gen Loss: 1.7241, Disc Loss: 0.9256, Acc: 1.0000
Step  500/597 - Gen Loss: 1.7191, Disc Loss: 0.9342, Acc: 1.0000

Epoch 40
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.3179, Disc Loss: 0.8334, Acc: 1.0000
Step  100/597 - Gen Loss: 1.7622, Disc Loss: 0.8713, Acc: 1.0000
Step  200/597 - Gen Loss: 1.8087, Disc Loss: 0.9049, Acc: 1.0000
Step  300/597 - Gen Loss: 1.8252, Disc Loss: 0.9030, Acc: 0.9999
Step  400/597 - Gen Loss: 1.8056, Disc Loss: 0.9081, Acc: 1.0000
Step  500/597 - Gen Loss: 1.7988, Disc Loss: 0.9066, Acc: 1.0000

Epoch 41
------------------------------------------------------------
Step    0/597 - Gen Loss: 2.0459, Disc Loss: 1.0854, Acc: 1.0000
Step  100/597 - Gen Loss: 1.7037, Disc Loss: 0.9490, Acc: 1.0000
Step  200/597 - Gen Loss: 1.7028, Disc Loss: 0.9570, Acc: 1.0000
Step  300/597 - Gen Loss: 1.6937, Disc Loss: 0.9611, Acc: 1.0000
Step  400/597 - Gen Loss: 1.7075, Disc Loss: 0.9491, Acc: 1.0000
Step  500/597 - Gen Loss: 1.7204, Disc Loss: 0.9540, Acc: 1.0000

Epoch 42
------------------------------------------------------------
Step    0/597 - Gen Loss: 2.2901, Disc Loss: 0.8918, Acc: 1.0000
Step  100/597 - Gen Loss: 1.7661, Disc Loss: 0.9183, Acc: 0.9998
Step  200/597 - Gen Loss: 1.7554, Disc Loss: 0.9054, Acc: 0.9999
Step  300/597 - Gen Loss: 1.7868, Disc Loss: 0.8946, Acc: 0.9999
Step  400/597 - Gen Loss: 1.7972, Disc Loss: 0.8886, Acc: 1.0000
Step  500/597 - Gen Loss: 1.7653, Disc Loss: 0.9166, Acc: 1.0000

Epoch 43
------------------------------------------------------------
Step    0/597 - Gen Loss: 2.8830, Disc Loss: 1.4287, Acc: 1.0000
Step  100/597 - Gen Loss: 1.8709, Disc Loss: 0.8817, Acc: 1.0000
Step  200/597 - Gen Loss: 1.8296, Disc Loss: 0.9213, Acc: 1.0000
Step  300/597 - Gen Loss: 1.7817, Disc Loss: 0.9033, Acc: 1.0000
Step  400/597 - Gen Loss: 1.7889, Disc Loss: 0.8976, Acc: 1.0000
Step  500/597 - Gen Loss: 1.8012, Disc Loss: 0.8948, Acc: 1.0000

Epoch 44
------------------------------------------------------------
Step    0/597 - Gen Loss: 2.2190, Disc Loss: 0.8221, Acc: 1.0000
Step  100/597 - Gen Loss: 1.6523, Disc Loss: 0.9457, Acc: 0.9998
Step  200/597 - Gen Loss: 1.6693, Disc Loss: 0.9236, Acc: 0.9999
Step  300/597 - Gen Loss: 1.6994, Disc Loss: 0.9208, Acc: 0.9999
Step  400/597 - Gen Loss: 1.7166, Disc Loss: 0.9094, Acc: 1.0000
Step  500/597 - Gen Loss: 1.7305, Disc Loss: 0.9080, Acc: 1.0000

Epoch 45
------------------------------------------------------------
Step    0/597 - Gen Loss: 2.7449, Disc Loss: 0.3622, Acc: 1.0000
Step  100/597 - Gen Loss: 1.8344, Disc Loss: 0.8806, Acc: 1.0000
Step  200/597 - Gen Loss: 1.8582, Disc Loss: 0.8833, Acc: 1.0000
Step  300/597 - Gen Loss: 1.8269, Disc Loss: 0.8788, Acc: 1.0000
Step  400/597 - Gen Loss: 1.8347, Disc Loss: 0.8806, Acc: 1.0000
Step  500/597 - Gen Loss: 1.8798, Disc Loss: 0.8577, Acc: 1.0000

Epoch 46
------------------------------------------------------------
Step    0/597 - Gen Loss: 2.7991, Disc Loss: 1.1140, Acc: 1.0000
Step  100/597 - Gen Loss: 1.8458, Disc Loss: 0.9426, Acc: 1.0000
Step  200/597 - Gen Loss: 1.8739, Disc Loss: 0.8929, Acc: 1.0000
Step  300/597 - Gen Loss: 1.8811, Disc Loss: 0.8685, Acc: 1.0000
Step  400/597 - Gen Loss: 1.8846, Disc Loss: 0.8838, Acc: 1.0000
Step  500/597 - Gen Loss: 1.8523, Disc Loss: 0.8937, Acc: 1.0000

Epoch 47
------------------------------------------------------------
Step    0/597 - Gen Loss: 0.8511, Disc Loss: 0.9797, Acc: 1.0000
Step  100/597 - Gen Loss: 1.9912, Disc Loss: 0.8249, Acc: 1.0000
Step  200/597 - Gen Loss: 1.9348, Disc Loss: 0.8494, Acc: 1.0000
Step  300/597 - Gen Loss: 1.9125, Disc Loss: 0.8593, Acc: 1.0000
Step  400/597 - Gen Loss: 1.9001, Disc Loss: 0.8672, Acc: 1.0000
Step  500/597 - Gen Loss: 1.8875, Disc Loss: 0.8750, Acc: 1.0000

Epoch 48
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.7912, Disc Loss: 0.7600, Acc: 1.0000
Step  100/597 - Gen Loss: 1.8537, Disc Loss: 0.8658, Acc: 1.0000
Step  200/597 - Gen Loss: 1.9122, Disc Loss: 0.8806, Acc: 1.0000
Step  300/597 - Gen Loss: 1.8679, Disc Loss: 0.8863, Acc: 1.0000
Step  400/597 - Gen Loss: 1.8963, Disc Loss: 0.8600, Acc: 1.0000
Step  500/597 - Gen Loss: 1.8973, Disc Loss: 0.8810, Acc: 1.0000

Epoch 49
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.5502, Disc Loss: 0.6530, Acc: 1.0000
Step  100/597 - Gen Loss: 2.0059, Disc Loss: 0.7974, Acc: 1.0000
Step  200/597 - Gen Loss: 1.9342, Disc Loss: 0.8670, Acc: 1.0000
Step  300/597 - Gen Loss: 1.8962, Disc Loss: 0.8771, Acc: 1.0000
Step  400/597 - Gen Loss: 1.8617, Disc Loss: 0.8795, Acc: 1.0000
Step  500/597 - Gen Loss: 1.8400, Disc Loss: 0.8835, Acc: 1.0000

Epoch 50
------------------------------------------------------------
Step    0/597 - Gen Loss: 3.0670, Disc Loss: 0.9121, Acc: 1.0000
Step  100/597 - Gen Loss: 1.9531, Disc Loss: 0.8352, Acc: 1.0000
Step  200/597 - Gen Loss: 1.9198, Disc Loss: 0.8753, Acc: 1.0000
Step  300/597 - Gen Loss: 1.9161, Disc Loss: 0.8558, Acc: 1.0000
Step  400/597 - Gen Loss: 1.8767, Disc Loss: 0.8702, Acc: 1.0000
Step  500/597 - Gen Loss: 1.9094, Disc Loss: 0.8628, Acc: 1.0000

Generating Grid of Samples at Final Epoch 50:
Generating 6 samples per class for all 16 letter classes...
Final Generated Samples - All Letter Classes:
Generating 6 samples per class for all 16 letter classes...

Observation:¶

  • Generated Samples: Letters are mostly recognizable, with clearer structure than some earlier GAN variants, though several classes still show slight inconsistencies.
  • Training Losses: Both generator and discriminator losses stabilize after sharp early movement, indicating balanced but possibly slow improvement.
  • Discriminator Accuracy: Accuracy rapidly climbs to nearly 100% and stays there, suggesting discriminator dominance and potential generator underfitting.
  • Loss Balance: A steep early shift is followed by stabilization, with the discriminator loss settling near 1.0 and the generator loss fluctuating between roughly 1.4 and 2.0.
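
A common mitigation for the discriminator dominance observed above is one-sided label smoothing, where real targets are set slightly below 1.0 so the discriminator cannot become fully confident. This is a minimal illustrative sketch, not the loss used in the training loop above; the tensor values are dummy sigmoid outputs.

```python
import tensorflow as tf

bce = tf.keras.losses.BinaryCrossentropy()

def discriminator_loss(real_preds, fake_preds, smooth=0.9):
    # One-sided label smoothing: real targets are 0.9 instead of 1.0,
    # which discourages the discriminator from saturating at 100% accuracy.
    real_loss = bce(tf.ones_like(real_preds) * smooth, real_preds)
    fake_loss = bce(tf.zeros_like(fake_preds), fake_preds)
    return real_loss + fake_loss

# Sanity check on dummy sigmoid predictions
real = tf.constant([[0.9], [0.8]])
fake = tf.constant([[0.1], [0.2]])
loss = discriminator_loss(real, fake)
```

Only the real-side targets are smoothed; smoothing the fake side as well is known to hurt GAN training.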

Enhanced CGAN Implementation¶

This section implements an enhanced CGAN that adds several modern improvements over the standard CGAN:

Enhanced Architecture:¶

  • Enhanced Generator: Residual blocks + Self-attention mechanism + Advanced label conditioning + Progressive upsampling
  • Enhanced Discriminator: Spectral normalization + Multi-scale discrimination + Advanced label conditioning + Improved stability
  • Advanced conditioning: Feature-wise linear modulation (FiLM) for better label conditioning
  • Training strategy: Advanced adversarial training with gradient penalty and auxiliary losses
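
FiLM conditions intermediate feature maps by predicting a per-channel scale (gamma) and shift (beta) from the label embedding, computing `gamma * x + beta`. The following is a minimal functional sketch of the idea (layer names and sizes are illustrative); the full Keras layer version appears in the implementation cell below.

```python
import tensorflow as tf

num_classes, channels = 16, 256

# A label embedding feeds two dense heads that produce per-channel
# scale (gamma) and shift (beta) parameters.
embed = tf.keras.layers.Embedding(num_classes, 128)
to_gamma = tf.keras.layers.Dense(channels)
to_beta = tf.keras.layers.Dense(channels)

def film(features, labels):
    """features: [B, H, W, C]; labels: [B] int class ids."""
    e = embed(labels)                       # [B, 128]
    gamma = to_gamma(e)[:, None, None, :]   # [B, 1, 1, C] for broadcasting
    beta = to_beta(e)[:, None, None, :]
    return gamma * features + beta          # channel-wise affine modulation

x = tf.random.normal([2, 14, 14, channels])
y = tf.constant([3, 7])
out = film(x, y)
```

The modulation leaves the spatial shape untouched; only the per-channel statistics change with the label.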

Key Improvements:¶

  • Residual connections: Better gradient flow and training stability
  • Self-attention: Improved spatial coherence and detail generation
  • Spectral normalization: Stabilized training and improved convergence
  • Feature-wise modulation: More effective conditional generation
  • Progressive training: Better quality progression and stability
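
The gradient penalty mentioned in the training strategy can be sketched in WGAN-GP style: penalize the discriminator's gradient norm on random interpolations between real and fake images. This is an illustrative stand-alone version, not the exact loss in the cell below, and the toy discriminator here is just a stand-in function.

```python
import tensorflow as tf

def gradient_penalty(discriminator, real_images, fake_images):
    """Penalize deviation of the discriminator's gradient norm from 1."""
    batch_size = tf.shape(real_images)[0]
    # Random per-sample interpolation between real and fake images
    alpha = tf.random.uniform([batch_size, 1, 1, 1], 0.0, 1.0)
    interpolated = alpha * real_images + (1.0 - alpha) * fake_images
    with tf.GradientTape() as tape:
        tape.watch(interpolated)
        pred = discriminator(interpolated)
    grads = tape.gradient(pred, interpolated)
    # Per-sample L2 norm of the gradients over all pixel dimensions
    norm = tf.sqrt(tf.reduce_sum(tf.square(grads), axis=[1, 2, 3]) + 1e-12)
    return tf.reduce_mean(tf.square(norm - 1.0))

# Demo with a toy "discriminator" (mean pixel intensity)
toy_disc = lambda x: tf.reduce_mean(x, axis=[1, 2, 3])
real = tf.random.normal([4, 28, 28, 1])
fake = tf.random.normal([4, 28, 28, 1])
gp = gradient_penalty(toy_disc, real, fake)
```

In practice the penalty is added to the discriminator loss with a weight (commonly 10) and computed on the conditional discriminator with the matching labels.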
In [34]:
# ENHANCED CGAN - COMPLETE IMPLEMENTATION

class SpectralNormalization(tf.keras.layers.Wrapper):
    """Spectral Normalization wrapper for improved training stability"""
    
    def __init__(self, layer, **kwargs):
        super(SpectralNormalization, self).__init__(layer, **kwargs)
        
    def build(self, input_shape):
        self.layer.build(input_shape)
        self.w = self.layer.kernel
        self.w_shape = self.w.shape.as_list()
        
        self.u = self.add_weight(
            shape=(1, self.w_shape[-1]),
            initializer='random_normal',
            trainable=False,
            name='u'
        )
        
        super(SpectralNormalization, self).build(input_shape)
        
    def call(self, inputs, training=None):
        self.update_weights()
        return self.layer(inputs)
        
    def update_weights(self):
        # One step of power iteration to estimate the largest singular value (sigma)
        w_reshaped = tf.reshape(self.w, [-1, self.w_shape[-1]])
        
        u_hat = self.u
        v_hat = tf.nn.l2_normalize(tf.matmul(u_hat, w_reshaped, transpose_b=True))
        u_hat = tf.nn.l2_normalize(tf.matmul(v_hat, w_reshaped))
        
        sigma = tf.matmul(tf.matmul(v_hat, w_reshaped), u_hat, transpose_b=True)
        self.w.assign(self.w / sigma)
        self.u.assign(u_hat)

class SelfAttention(tf.keras.layers.Layer):
    """Self-attention mechanism for better spatial coherence"""
    
    def __init__(self, **kwargs):
        super(SelfAttention, self).__init__(**kwargs)
        
    def build(self, input_shape):
        # Convert TensorShape to int
        channels = int(input_shape[-1])
        
        self.query_conv = tf.keras.layers.Conv2D(channels // 8, 1)
        self.key_conv = tf.keras.layers.Conv2D(channels // 8, 1)
        self.value_conv = tf.keras.layers.Conv2D(channels, 1)
        self.gamma = self.add_weight(
            shape=(),
            initializer='zeros',
            trainable=True,
            name='gamma'
        )
        super(SelfAttention, self).build(input_shape)
        
    def call(self, inputs):
        shape = tf.shape(inputs)
        batch_size, height, width, channels = shape[0], shape[1], shape[2], shape[3]
        
        # Generate query, key, value
        query = self.query_conv(inputs)
        key = self.key_conv(inputs)
        value = self.value_conv(inputs)
        
        # Reshape for attention computation
        query = tf.reshape(query, [batch_size, -1, tf.shape(query)[-1]])
        key = tf.reshape(key, [batch_size, -1, tf.shape(key)[-1]])
        value = tf.reshape(value, [batch_size, -1, tf.shape(value)[-1]])
        
        # Compute attention
        attention = tf.nn.softmax(tf.matmul(query, key, transpose_b=True))
        out = tf.matmul(attention, value)
        
        # Reshape back
        out = tf.reshape(out, [batch_size, height, width, channels])
        
        return self.gamma * out + inputs

class FiLM(tf.keras.layers.Layer):
    """Feature-wise Linear Modulation for better conditional generation"""
    
    def __init__(self, num_classes, **kwargs):
        super(FiLM, self).__init__(**kwargs)
        self.num_classes = num_classes
        
    def build(self, input_shape):
        # Convert TensorShape to int
        self.channels = int(input_shape[-1])
        
        self.gamma_dense = tf.keras.layers.Dense(self.channels)
        self.beta_dense = tf.keras.layers.Dense(self.channels)
        self.label_embedding = tf.keras.layers.Embedding(self.num_classes, 128)
        super(FiLM, self).build(input_shape)
        
    def call(self, inputs, labels):
        # inputs: [batch, height, width, channels]
        # labels: [batch]
        
        # Generate conditioning parameters
        label_emb = self.label_embedding(labels)
        gamma = self.gamma_dense(label_emb)
        beta = self.beta_dense(label_emb)
        
        # Reshape for broadcasting
        gamma = tf.expand_dims(tf.expand_dims(gamma, 1), 1)
        beta = tf.expand_dims(tf.expand_dims(beta, 1), 1)
        
        # Apply FiLM
        return gamma * inputs + beta

def build_enhanced_cgan_generator(latent_dim=100, num_classes=16, img_height=28, img_width=28):
    """Enhanced CGAN generator with advanced features"""
    print("🔨 Building Enhanced CGAN Generator...")
    
    # Inputs
    noise_input = tf.keras.layers.Input(shape=(latent_dim,), name='noise_input')
    label_input = tf.keras.layers.Input(shape=(), dtype='int32', name='label_input')
    
    # Enhanced label conditioning
    label_embedding = tf.keras.layers.Embedding(num_classes, latent_dim)(label_input)
    label_embedding = tf.keras.layers.Flatten()(label_embedding)
    
    # Combine noise and label
    combined = tf.keras.layers.Concatenate()([noise_input, label_embedding])
    
    # Dense layer with residual connection
    x = tf.keras.layers.Dense(7 * 7 * 512, use_bias=False)(combined)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.ReLU()(x)
    x = tf.keras.layers.Reshape((7, 7, 512))(x)
    
    # First upsampling block with FiLM conditioning
    x = tf.keras.layers.Conv2DTranspose(256, 4, strides=2, padding='same', use_bias=False)(x)
    x = tf.keras.layers.BatchNormalization()(x)
    
    # Apply FiLM conditioning
    film_layer_1 = FiLM(num_classes)
    x = film_layer_1(x, label_input)
    x = tf.keras.layers.ReLU()(x)
    
    # Self-attention for better spatial coherence
    x = SelfAttention()(x)
    
    # Second upsampling block
    x = tf.keras.layers.Conv2DTranspose(128, 4, strides=2, padding='same', use_bias=False)(x)
    x = tf.keras.layers.BatchNormalization()(x)
    
    # Apply FiLM conditioning
    film_layer_2 = FiLM(num_classes)
    x = film_layer_2(x, label_input)
    x = tf.keras.layers.ReLU()(x)
    
    # Final output layer
    x = tf.keras.layers.Conv2D(1, 3, padding='same', activation='tanh')(x)
    
    model = tf.keras.Model(
        inputs=[noise_input, label_input],
        outputs=x,
        name='enhanced_cgan_generator'
    )
    
    return model

def build_enhanced_cgan_discriminator(img_height=28, img_width=28, num_classes=16):
    """Enhanced CGAN discriminator with spectral normalization"""

    # Inputs
    img_input = tf.keras.layers.Input(shape=(img_height, img_width, 1), name='img_input')
    label_input = tf.keras.layers.Input(shape=(), dtype='int32', name='label_input')
    
    # Process image with spectral normalization
    x = img_input
    
    # First conv block with spectral norm
    x = SpectralNormalization(tf.keras.layers.Conv2D(64, 4, strides=2, padding='same'))(x)
    x = tf.keras.layers.LeakyReLU(alpha=0.2)(x)
    x = tf.keras.layers.Dropout(0.3)(x)
    
    # Second conv block
    x = SpectralNormalization(tf.keras.layers.Conv2D(128, 4, strides=2, padding='same'))(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.LeakyReLU(alpha=0.2)(x)
    x = tf.keras.layers.Dropout(0.3)(x)
    
    # Third conv block with self-attention
    x = SpectralNormalization(tf.keras.layers.Conv2D(256, 4, strides=2, padding='same'))(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.LeakyReLU(alpha=0.2)(x)
    x = SelfAttention()(x)
    x = tf.keras.layers.Dropout(0.3)(x)
    
    # Fourth conv block
    x = SpectralNormalization(tf.keras.layers.Conv2D(512, 4, strides=2, padding='same'))(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.LeakyReLU(alpha=0.2)(x)
    x = tf.keras.layers.Dropout(0.4)(x)
    
    # Flatten and process label
    x = tf.keras.layers.Flatten()(x)
    
    # Enhanced label processing
    label_embedding = tf.keras.layers.Embedding(num_classes, 128)(label_input)
    label_embedding = tf.keras.layers.Flatten()(label_embedding)
    
    # Combine features
    x = tf.keras.layers.Concatenate()([x, label_embedding])
    
    # Dense layers with spectral norm
    x = SpectralNormalization(tf.keras.layers.Dense(1024))(x)
    x = tf.keras.layers.LeakyReLU(alpha=0.2)(x)
    x = tf.keras.layers.Dropout(0.5)(x)
    
    x = SpectralNormalization(tf.keras.layers.Dense(512))(x)
    x = tf.keras.layers.LeakyReLU(alpha=0.2)(x)
    x = tf.keras.layers.Dropout(0.5)(x)
    
    # Output layers
    validity = tf.keras.layers.Dense(1, activation='sigmoid', name='validity')(x)
    label_pred = tf.keras.layers.Dense(num_classes, activation='softmax', name='label_pred')(x)
    
    model = tf.keras.Model(
        inputs=[img_input, label_input],
        outputs=[validity, label_pred],
        name='enhanced_cgan_discriminator'
    )
    
    return model

# Build Enhanced CGAN models
enhanced_cgan_generator = build_enhanced_cgan_generator(
    latent_dim=100, 
    num_classes=num_classes, 
    img_height=28, 
    img_width=28
)

enhanced_cgan_discriminator = build_enhanced_cgan_discriminator(
    img_height=28, 
    img_width=28, 
    num_classes=num_classes
)

# Display model architectures
print("\nEnhanced CGAN Generator Architecture:")
enhanced_cgan_generator.summary()

print("\nEnhanced CGAN Discriminator Architecture:")
enhanced_cgan_discriminator.summary()
🔨 Building Enhanced CGAN Generator...

Enhanced CGAN Generator Architecture:
Model: "enhanced_cgan_generator"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
==================================================================================================
 label_input (InputLayer)       [(None,)]            0           []                               
                                                                                                  
 embedding_14 (Embedding)       (None, 100)          1600        ['label_input[0][0]']            
                                                                                                  
 noise_input (InputLayer)       [(None, 100)]        0           []                               
                                                                                                  
 flatten_22 (Flatten)           (None, 100)          0           ['embedding_14[0][0]']           
                                                                                                  
 concatenate_16 (Concatenate)   (None, 200)          0           ['noise_input[0][0]',            
                                                                  'flatten_22[0][0]']             
                                                                                                  
 dense_17 (Dense)               (None, 25088)        5017600     ['concatenate_16[0][0]']         
                                                                                                  
 batch_normalization_134 (Batch  (None, 25088)       100352      ['dense_17[0][0]']               
 Normalization)                                                                                   
                                                                                                  
 re_lu_17 (ReLU)                (None, 25088)        0           ['batch_normalization_134[0][0]']
                                                                                                  
 reshape_9 (Reshape)            (None, 7, 7, 512)    0           ['re_lu_17[0][0]']               
                                                                                                  
 conv2d_transpose_24 (Conv2DTra  (None, 14, 14, 256)  2097152    ['reshape_9[0][0]']              
 nspose)                                                                                          
                                                                                                  
 batch_normalization_135 (Batch  (None, 14, 14, 256)  1024       ['conv2d_transpose_24[0][0]']    
 Normalization)                                                                                   
                                                                                                  
 fi_lm (FiLM)                   (None, 14, 14, 256)  68096       ['batch_normalization_135[0][0]',
                                                                  'label_input[0][0]']            
                                                                                                  
 re_lu_18 (ReLU)                (None, 14, 14, 256)  0           ['fi_lm[0][0]']                  
                                                                                                  
 self_attention (SelfAttention)  (None, 14, 14, 256)  82241      ['re_lu_18[0][0]']               
                                                                                                  
 conv2d_transpose_25 (Conv2DTra  (None, 28, 28, 128)  524288     ['self_attention[0][0]']         
 nspose)                                                                                          
                                                                                                  
 batch_normalization_136 (Batch  (None, 28, 28, 128)  512        ['conv2d_transpose_25[0][0]']    
 Normalization)                                                                                   
                                                                                                  
 fi_lm_1 (FiLM)                 (None, 28, 28, 128)  35072       ['batch_normalization_136[0][0]',
                                                                  'label_input[0][0]']            
                                                                                                  
 re_lu_19 (ReLU)                (None, 28, 28, 128)  0           ['fi_lm_1[0][0]']                
                                                                                                  
 conv2d_129 (Conv2D)            (None, 28, 28, 1)    1153        ['re_lu_19[0][0]']               
                                                                                                  
==================================================================================================
Total params: 7,929,090
Trainable params: 7,878,146
Non-trainable params: 50,944
__________________________________________________________________________________________________

Enhanced CGAN Discriminator Architecture:
Model: "enhanced_cgan_discriminator"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
==================================================================================================
 img_input (InputLayer)         [(None, 28, 28, 1)]  0           []                               
                                                                                                  
 spectral_normalization_3 (Spec  (None, 14, 14, 64)  1152        ['img_input[0][0]']              
 tralNormalization)                                                                               
                                                                                                  
 leaky_re_lu_36 (LeakyReLU)     (None, 14, 14, 64)   0           ['spectral_normalization_3[0][0]'
                                                                 ]                                
                                                                                                  
 dropout_26 (Dropout)           (None, 14, 14, 64)   0           ['leaky_re_lu_36[0][0]']         
                                                                                                  
 spectral_normalization_4 (Spec  (None, 7, 7, 128)   131328      ['dropout_26[0][0]']             
 tralNormalization)                                                                               
                                                                                                  
 batch_normalization_137 (Batch  (None, 7, 7, 128)   512         ['spectral_normalization_4[0][0]'
 Normalization)                                                  ]                                
                                                                                                  
 leaky_re_lu_37 (LeakyReLU)     (None, 7, 7, 128)    0           ['batch_normalization_137[0][0]']
                                                                                                  
 dropout_27 (Dropout)           (None, 7, 7, 128)    0           ['leaky_re_lu_37[0][0]']         
                                                                                                  
 spectral_normalization_5 (Spec  (None, 4, 4, 256)   524800      ['dropout_27[0][0]']             
 tralNormalization)                                                                               
                                                                                                  
 batch_normalization_138 (Batch  (None, 4, 4, 256)   1024        ['spectral_normalization_5[0][0]'
 Normalization)                                                  ]                                
                                                                                                  
 leaky_re_lu_38 (LeakyReLU)     (None, 4, 4, 256)    0           ['batch_normalization_138[0][0]']
                                                                                                  
 self_attention_1 (SelfAttentio  (None, 4, 4, 256)   82241       ['leaky_re_lu_38[0][0]']         
 n)                                                                                               
                                                                                                  
 dropout_28 (Dropout)           (None, 4, 4, 256)    0           ['self_attention_1[0][0]']       
                                                                                                  
 spectral_normalization_6 (Spec  (None, 2, 2, 512)   2098176     ['dropout_28[0][0]']             
 tralNormalization)                                                                               
                                                                                                  
 batch_normalization_139 (Batch  (None, 2, 2, 512)   2048        ['spectral_normalization_6[0][0]'
 Normalization)                                                  ]                                
                                                                                                  
 leaky_re_lu_39 (LeakyReLU)     (None, 2, 2, 512)    0           ['batch_normalization_139[0][0]']
                                                                                                  
 label_input (InputLayer)       [(None,)]            0           []                               
                                                                                                  
 dropout_29 (Dropout)           (None, 2, 2, 512)    0           ['leaky_re_lu_39[0][0]']         
                                                                                                  
 embedding_15 (Embedding)       (None, 128)          2048        ['label_input[0][0]']            
                                                                                                  
 flatten_23 (Flatten)           (None, 2048)         0           ['dropout_29[0][0]']             
                                                                                                  
 flatten_24 (Flatten)           (None, 128)          0           ['embedding_15[0][0]']           
                                                                                                  
 concatenate_17 (Concatenate)   (None, 2176)         0           ['flatten_23[0][0]',             
                                                                  'flatten_24[0][0]']             
                                                                                                  
 spectral_normalization_7 (Spec  (None, 1024)        2230272     ['concatenate_17[0][0]']         
 tralNormalization)                                                                               
                                                                                                  
 leaky_re_lu_40 (LeakyReLU)     (None, 1024)         0           ['spectral_normalization_7[0][0]'
                                                                 ]                                
                                                                                                  
 dropout_30 (Dropout)           (None, 1024)         0           ['leaky_re_lu_40[0][0]']         
                                                                                                  
 spectral_normalization_8 (Spec  (None, 512)         525312      ['dropout_30[0][0]']             
 tralNormalization)                                                                               
                                                                                                  
 leaky_re_lu_41 (LeakyReLU)     (None, 512)          0           ['spectral_normalization_8[0][0]'
                                                                 ]                                
                                                                                                  
 dropout_31 (Dropout)           (None, 512)          0           ['leaky_re_lu_41[0][0]']         
                                                                                                  
 validity (Dense)               (None, 1)            513         ['dropout_31[0][0]']             
                                                                                                  
 label_pred (Dense)             (None, 16)           8208        ['dropout_31[0][0]']             
                                                                                                  
==================================================================================================
Total params: 5,607,634
Trainable params: 5,603,346
Non-trainable params: 4,288
__________________________________________________________________________________________________
In [35]:
# IMPROVED ENHANCED CGAN MODELS - BETTER ARCHITECTURE

# Custom layers for better performance
class SpectralNormalization(tf.keras.layers.Wrapper):
    """Wraps a layer and divides its kernel by an estimate of its largest
    singular value, using one power-iteration step per forward pass."""

    def __init__(self, layer, **kwargs):
        super(SpectralNormalization, self).__init__(layer, **kwargs)
        
    def build(self, input_shape):
        self.layer.build(input_shape)
        self.w = self.layer.kernel
        # Persistent singular-vector estimate, refined a little on every call
        self.u = self.add_weight(
            shape=(1, int(self.w.shape[-1])),
            initializer='random_normal',
            trainable=False,
            name='u'
        )
        super(SpectralNormalization, self).build(input_shape)
        
    def call(self, inputs):
        # One power-iteration step; stop_gradient keeps the estimate
        # itself out of backpropagation (only the division by sigma is traced)
        w_reshaped = tf.reshape(self.w, [-1, int(self.w.shape[-1])])
        v = tf.stop_gradient(tf.nn.l2_normalize(tf.matmul(self.u, tf.transpose(w_reshaped))))
        u = tf.stop_gradient(tf.nn.l2_normalize(tf.matmul(v, w_reshaped)))
        
        # sigma approximates the spectral norm (largest singular value) of w
        sigma = tf.matmul(tf.matmul(v, w_reshaped), tf.transpose(u))
        w_norm = self.w / sigma
        
        # Carry the refined estimate over to the next call
        self.u.assign(u)
        
        # Temporarily swap in the normalized kernel for this forward pass
        original_kernel = self.layer.kernel
        self.layer.kernel = w_norm
        output = self.layer(inputs)
        self.layer.kernel = original_kernel
        
        return output
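The power-iteration step inside `call` can be checked in isolation: iterating `u -> normalize(normalize(u W^T) W)` converges to the top singular pair, so `sigma = v W u^T` approaches the true spectral norm. A minimal NumPy sketch of the same arithmetic (the matrix and iteration count are illustrative; the layer performs one step per forward pass and reuses `u` across steps):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(9, 64))   # stand-in for the flattened kernel (in_features, out_features)
u = rng.normal(size=(1, 64))   # persistent estimate, like the layer's `self.u`

# Repeated power-iteration steps; each matches one call() of the wrapper
for _ in range(200):
    v = u @ W.T
    v /= np.linalg.norm(v)
    u = v @ W
    u /= np.linalg.norm(u)

sigma = float(v @ W @ u.T)     # estimated largest singular value
true_sigma = np.linalg.svd(W, compute_uv=False)[0]
print(sigma, true_sigma)
```

With enough steps the two printed values agree closely, which is why reusing `u` between forward passes is sufficient in practice.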

def build_enhanced_cgan_generator(latent_dim=100, num_classes=16, img_height=28, img_width=28):
    """
    Build Enhanced CGAN generator with improved architecture:
    - Better label conditioning with FiLM
    - Residual connections
    - Self-attention
    - Progressive upsampling
    """    
    # Noise input
    noise_input = tf.keras.layers.Input(shape=(latent_dim,), name='noise_input')
    
    # Label input with improved embedding
    label_input = tf.keras.layers.Input(shape=(), dtype='int32', name='label_input')
    label_embedding = tf.keras.layers.Embedding(num_classes, 128)(label_input)
    label_embedding = tf.keras.layers.Flatten()(label_embedding)
    label_embedding = tf.keras.layers.Dense(latent_dim, activation='relu')(label_embedding)
    
    # Feature-wise Linear Modulation (FiLM)
    gamma = tf.keras.layers.Dense(latent_dim, activation='sigmoid')(label_embedding)
    beta = tf.keras.layers.Dense(latent_dim)(label_embedding)
    
    # Apply FiLM conditioning
    conditioned_noise = tf.keras.layers.Multiply()([noise_input, gamma])
    conditioned_noise = tf.keras.layers.Add()([conditioned_noise, beta])
    
    # Add residual connection
    combined_input = tf.keras.layers.Add()([noise_input, conditioned_noise])
    
    # Dense layer to create initial feature map
    x = tf.keras.layers.Dense(7 * 7 * 512, use_bias=False)(combined_input)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.ReLU()(x)
    x = tf.keras.layers.Reshape((7, 7, 512))(x)
    x = tf.keras.layers.Dropout(0.2)(x)
    
    # Residual block at 7x7
    residual = x
    x = tf.keras.layers.Conv2D(512, 3, padding='same', use_bias=False)(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.ReLU()(x)
    x = tf.keras.layers.Conv2D(512, 3, padding='same', use_bias=False)(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.Add()([x, residual])
    x = tf.keras.layers.ReLU()(x)
    
    # Simplified self-attention at 7x7 resolution
    attention_input = tf.keras.layers.Reshape((49, 512))(x)
    query = tf.keras.layers.Dense(128)(attention_input)
    key = tf.keras.layers.Dense(128)(attention_input)
    value = tf.keras.layers.Dense(128)(attention_input)
    
    # Attention mechanism
    attention_scores = tf.keras.layers.Lambda(
        lambda x: tf.matmul(x[0], x[1], transpose_b=True) / tf.sqrt(128.0)
    )([query, key])
    attention_weights = tf.keras.layers.Activation('softmax')(attention_scores)
    attention_output = tf.keras.layers.Lambda(
        lambda x: tf.matmul(x[0], x[1])
    )([attention_weights, value])
    
    # Project back and add residual
    attention_output = tf.keras.layers.Dense(512)(attention_output)
    attention_output = tf.keras.layers.Add()([attention_input, attention_output])
    x = tf.keras.layers.Reshape((7, 7, 512))(attention_output)
    
    # First upsampling: 7x7x512 -> 14x14x256
    x = tf.keras.layers.Conv2DTranspose(256, 5, strides=2, padding='same', use_bias=False)(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.ReLU()(x)
    
    # Residual block at 14x14
    residual = x
    x = tf.keras.layers.Conv2D(256, 3, padding='same', use_bias=False)(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.ReLU()(x)
    x = tf.keras.layers.Conv2D(256, 3, padding='same', use_bias=False)(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.Add()([x, residual])
    x = tf.keras.layers.ReLU()(x)
    
    # Second upsampling: 14x14x256 -> 28x28x128
    x = tf.keras.layers.Conv2DTranspose(128, 5, strides=2, padding='same', use_bias=False)(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.ReLU()(x)
    
    # Final conditioning with label
    label_broadcast = tf.keras.layers.Dense(128)(label_embedding)
    label_broadcast = tf.keras.layers.Reshape((1, 1, 128))(label_broadcast)
    label_broadcast = tf.keras.layers.Lambda(lambda x: tf.tile(x, [1, 28, 28, 1]))(label_broadcast)
    
    # Feature modulation
    x_gamma = tf.keras.layers.Conv2D(128, 1, activation='sigmoid', use_bias=False)(label_broadcast)
    x_beta = tf.keras.layers.Conv2D(128, 1, use_bias=False)(label_broadcast)
    x = tf.keras.layers.Multiply()([x, x_gamma])
    x = tf.keras.layers.Add()([x, x_beta])
    
    # Final layers
    x = tf.keras.layers.Conv2D(64, 3, padding='same', use_bias=False)(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.ReLU()(x)
    
    # Output layer
    x = tf.keras.layers.Conv2D(1, 3, padding='same', activation='tanh')(x)
    
    model = tf.keras.Model(
        inputs=[noise_input, label_input],
        outputs=x,
        name='enhanced_cgan_generator'
    )
    
    return model
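The FiLM conditioning at the top of the generator is a small amount of arithmetic: a sigmoid-gated scale `gamma` and a shift `beta`, both derived from the label embedding, modulate the noise vector, and a residual add keeps the raw noise in the signal path. A NumPy sketch under the same shapes (the learned `Dense` projections are replaced by fixed random values here):

```python
import numpy as np

rng = np.random.default_rng(1)
latent_dim = 100
noise = rng.normal(size=(4, latent_dim))  # batch of latent vectors

# Stand-ins for the Dense(latent_dim) projections of the label embedding
gamma = 1.0 / (1.0 + np.exp(-rng.normal(size=(4, latent_dim))))  # sigmoid gate in (0, 1)
beta = rng.normal(size=(4, latent_dim))

conditioned = noise * gamma + beta        # Multiply then Add (FiLM)
combined = noise + conditioned            # residual connection back to the raw noise
```

Because `gamma` is bounded in (0, 1), the label can attenuate but never explode individual noise dimensions, while `beta` gives it an unbounded additive influence.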

def build_enhanced_cgan_discriminator(img_height=28, img_width=28, num_classes=16):
    """
    Build Enhanced CGAN discriminator with improved architecture:
    - Spectral normalization
    - Self-attention
    - Better label conditioning
    - Progressive downsampling
    """    
    # Image input
    img_input = tf.keras.layers.Input(shape=(img_height, img_width, 1), name='img_input')
    label_input = tf.keras.layers.Input(shape=(), dtype='int32', name='label_input')
    
    # Process label with spatial conditioning
    label_embedding = tf.keras.layers.Embedding(num_classes, 64)(label_input)
    label_embedding = tf.keras.layers.Dense(img_height * img_width)(label_embedding)
    label_embedding = tf.keras.layers.Reshape((img_height, img_width, 1))(label_embedding)
    
    # Concatenate image and label map
    x = tf.keras.layers.Concatenate()([img_input, label_embedding])
    
    # First conv block: 28x28x2 -> 14x14x64
    x = SpectralNormalization(tf.keras.layers.Conv2D(64, 4, strides=2, padding='same'))(x)
    x = tf.keras.layers.LeakyReLU(alpha=0.2)(x)
    x = tf.keras.layers.Dropout(0.2)(x)
    
    # Second conv block: 14x14x64 -> 7x7x128
    x = SpectralNormalization(tf.keras.layers.Conv2D(128, 4, strides=2, padding='same'))(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.LeakyReLU(alpha=0.2)(x)
    x = tf.keras.layers.Dropout(0.3)(x)
    
    # Self-attention at 7x7 resolution
    attention_input = tf.keras.layers.Reshape((49, 128))(x)
    query = tf.keras.layers.Dense(64)(attention_input)
    key = tf.keras.layers.Dense(64)(attention_input)
    value = tf.keras.layers.Dense(64)(attention_input)
    
    # Attention mechanism
    attention_scores = tf.keras.layers.Lambda(
        lambda x: tf.matmul(x[0], x[1], transpose_b=True) / tf.sqrt(64.0)
    )([query, key])
    attention_weights = tf.keras.layers.Activation('softmax')(attention_scores)
    attention_output = tf.keras.layers.Lambda(
        lambda x: tf.matmul(x[0], x[1])
    )([attention_weights, value])
    
    # Project back and add residual
    attention_output = tf.keras.layers.Dense(128)(attention_output)
    attention_output = tf.keras.layers.Add()([attention_input, attention_output])
    x = tf.keras.layers.Reshape((7, 7, 128))(attention_output)
    
    # Third conv block: 7x7x128 -> 4x4x256
    x = SpectralNormalization(tf.keras.layers.Conv2D(256, 4, strides=1, padding='valid'))(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.LeakyReLU(alpha=0.2)(x)
    x = tf.keras.layers.Dropout(0.4)(x)
    
    # Fourth conv block: 4x4x256 -> 2x2x512
    x = SpectralNormalization(tf.keras.layers.Conv2D(512, 3, strides=2, padding='same'))(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.LeakyReLU(alpha=0.2)(x)
    x = tf.keras.layers.Dropout(0.5)(x)
    
    # Global average pooling
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    
    # Dense layers with better architecture
    x = SpectralNormalization(tf.keras.layers.Dense(512))(x)
    x = tf.keras.layers.LeakyReLU(alpha=0.2)(x)
    x = tf.keras.layers.Dropout(0.5)(x)
    
    # Separate feature extraction for different tasks
    validity_features = tf.keras.layers.Dense(256, activation='relu')(x)
    class_features = tf.keras.layers.Dense(256, activation='relu')(x)
    
    # Output layers
    validity = tf.keras.layers.Dense(1, activation='sigmoid', name='validity')(validity_features)
    label_pred = tf.keras.layers.Dense(num_classes, activation='softmax', name='label_pred')(class_features)
    
    model = tf.keras.Model(
        inputs=[img_input, label_input],
        outputs=[validity, label_pred],
        name='enhanced_cgan_discriminator'
    )
    
    return model
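Both builders use the same simplified self-attention block: queries, keys, and values are computed over the flattened spatial positions, scores are scaled by sqrt(d_k) and softmaxed, and the result is projected back and added residually. A NumPy sketch of the core computation with the generator's shapes (49 positions at 7x7, 512 features, 128-wide projections); the random matrices stand in for the learned `Dense` layers:

```python
import numpy as np

rng = np.random.default_rng(2)
n_pos, d_model, d_k = 49, 512, 128        # 7x7 positions, feature width, projection width

x = rng.normal(size=(n_pos, d_model))     # flattened feature map for one sample
Wq, Wk, Wv = (rng.normal(size=(d_model, d_k)) * 0.05 for _ in range(3))
Wo = rng.normal(size=(d_k, d_model)) * 0.05

q, k, v = x @ Wq, x @ Wk, x @ Wv
scores = q @ k.T / np.sqrt(d_k)           # (49, 49) position-to-position scores
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)   # softmax over key positions

out = x + (weights @ v) @ Wo              # project back to d_model, residual add
```

Each row of `weights` sums to 1, so every output position is its input plus a convex mixture of value vectors, letting distant strokes of a letter influence each other without extra convolution depth.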

# Build Enhanced CGAN models with improved architecture
enhanced_cgan_generator = build_enhanced_cgan_generator(
    latent_dim=100, 
    num_classes=num_classes, 
    img_height=28, 
    img_width=28
)

enhanced_cgan_discriminator = build_enhanced_cgan_discriminator(
    img_height=28, 
    img_width=28, 
    num_classes=num_classes
)


# Display model architectures
print("\nEnhanced CGAN Generator Architecture:")
enhanced_cgan_generator.summary()

print("\nEnhanced CGAN Discriminator Architecture:")
enhanced_cgan_discriminator.summary()
Enhanced CGAN Generator Architecture:
Model: "enhanced_cgan_generator"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
==================================================================================================
 label_input (InputLayer)       [(None,)]            0           []                               
                                                                                                  
 embedding_16 (Embedding)       (None, 128)          2048        ['label_input[0][0]']            
                                                                                                  
 flatten_25 (Flatten)           (None, 128)          0           ['embedding_16[0][0]']           
                                                                                                  
 dense_20 (Dense)               (None, 100)          12900       ['flatten_25[0][0]']             
                                                                                                  
 noise_input (InputLayer)       [(None, 100)]        0           []                               
                                                                                                  
 dense_21 (Dense)               (None, 100)          10100       ['dense_20[0][0]']               
                                                                                                  
 multiply (Multiply)            (None, 100)          0           ['noise_input[0][0]',            
                                                                  'dense_21[0][0]']               
                                                                                                  
 dense_22 (Dense)               (None, 100)          10100       ['dense_20[0][0]']               
                                                                                                  
 add (Add)                      (None, 100)          0           ['multiply[0][0]',               
                                                                  'dense_22[0][0]']               
                                                                                                  
 add_1 (Add)                    (None, 100)          0           ['noise_input[0][0]',            
                                                                  'add[0][0]']                    
                                                                                                  
 dense_23 (Dense)               (None, 25088)        2508800     ['add_1[0][0]']                  
                                                                                                  
 batch_normalization_140 (Batch  (None, 25088)       100352      ['dense_23[0][0]']               
 Normalization)                                                                                   
                                                                                                  
 re_lu_20 (ReLU)                (None, 25088)        0           ['batch_normalization_140[0][0]']
                                                                                                  
 reshape_10 (Reshape)           (None, 7, 7, 512)    0           ['re_lu_20[0][0]']               
                                                                                                  
 dropout_32 (Dropout)           (None, 7, 7, 512)    0           ['reshape_10[0][0]']             
                                                                                                  
 conv2d_134 (Conv2D)            (None, 7, 7, 512)    2359296     ['dropout_32[0][0]']             
                                                                                                  
 batch_normalization_141 (Batch  (None, 7, 7, 512)   2048        ['conv2d_134[0][0]']             
 Normalization)                                                                                   
                                                                                                  
 re_lu_21 (ReLU)                (None, 7, 7, 512)    0           ['batch_normalization_141[0][0]']
                                                                                                  
 conv2d_135 (Conv2D)            (None, 7, 7, 512)    2359296     ['re_lu_21[0][0]']               
                                                                                                  
 batch_normalization_142 (Batch  (None, 7, 7, 512)   2048        ['conv2d_135[0][0]']             
 Normalization)                                                                                   
                                                                                                  
 add_2 (Add)                    (None, 7, 7, 512)    0           ['batch_normalization_142[0][0]',
                                                                  'dropout_32[0][0]']             
                                                                                                  
 re_lu_22 (ReLU)                (None, 7, 7, 512)    0           ['add_2[0][0]']                  
                                                                                                  
 reshape_11 (Reshape)           (None, 49, 512)      0           ['re_lu_22[0][0]']               
                                                                                                  
 dense_24 (Dense)               (None, 49, 128)      65664       ['reshape_11[0][0]']             
                                                                                                  
 dense_25 (Dense)               (None, 49, 128)      65664       ['reshape_11[0][0]']             
                                                                                                  
 lambda (Lambda)                (None, 49, 49)       0           ['dense_24[0][0]',               
                                                                  'dense_25[0][0]']               
                                                                                                  
 activation_94 (Activation)     (None, 49, 49)       0           ['lambda[0][0]']                 
                                                                                                  
 dense_26 (Dense)               (None, 49, 128)      65664       ['reshape_11[0][0]']             
                                                                                                  
 lambda_1 (Lambda)              (None, 49, 128)      0           ['activation_94[0][0]',          
                                                                  'dense_26[0][0]']               
                                                                                                  
 dense_27 (Dense)               (None, 49, 512)      66048       ['lambda_1[0][0]']               
                                                                                                  
 add_3 (Add)                    (None, 49, 512)      0           ['reshape_11[0][0]',             
                                                                  'dense_27[0][0]']               
                                                                                                  
 reshape_12 (Reshape)           (None, 7, 7, 512)    0           ['add_3[0][0]']                  
                                                                                                  
 conv2d_transpose_26 (Conv2DTra  (None, 14, 14, 256)  3276800    ['reshape_12[0][0]']             
 nspose)                                                                                          
                                                                                                  
 batch_normalization_143 (Batch  (None, 14, 14, 256)  1024       ['conv2d_transpose_26[0][0]']    
 Normalization)                                                                                   
                                                                                                  
 re_lu_23 (ReLU)                (None, 14, 14, 256)  0           ['batch_normalization_143[0][0]']
                                                                                                  
 conv2d_136 (Conv2D)            (None, 14, 14, 256)  589824      ['re_lu_23[0][0]']               
                                                                                                  
 batch_normalization_144 (Batch  (None, 14, 14, 256)  1024       ['conv2d_136[0][0]']             
 Normalization)                                                                                   
                                                                                                  
 re_lu_24 (ReLU)                (None, 14, 14, 256)  0           ['batch_normalization_144[0][0]']
                                                                                                  
 conv2d_137 (Conv2D)            (None, 14, 14, 256)  589824      ['re_lu_24[0][0]']               
                                                                                                  
 batch_normalization_145 (Batch  (None, 14, 14, 256)  1024       ['conv2d_137[0][0]']             
 Normalization)                                                                                   
                                                                                                  
 add_4 (Add)                    (None, 14, 14, 256)  0           ['batch_normalization_145[0][0]',
                                                                  're_lu_23[0][0]']               
                                                                                                  
 re_lu_25 (ReLU)                (None, 14, 14, 256)  0           ['add_4[0][0]']                  
                                                                                                  
 dense_28 (Dense)               (None, 128)          12928       ['dense_20[0][0]']               
                                                                                                  
 conv2d_transpose_27 (Conv2DTra  (None, 28, 28, 128)  819200     ['re_lu_25[0][0]']               
 nspose)                                                                                          
                                                                                                  
 reshape_13 (Reshape)           (None, 1, 1, 128)    0           ['dense_28[0][0]']               
                                                                                                  
 batch_normalization_146 (Batch  (None, 28, 28, 128)  512        ['conv2d_transpose_27[0][0]']    
 Normalization)                                                                                   
                                                                                                  
 lambda_2 (Lambda)              (None, 28, 28, 128)  0           ['reshape_13[0][0]']             
                                                                                                  
 re_lu_26 (ReLU)                (None, 28, 28, 128)  0           ['batch_normalization_146[0][0]']
                                                                                                  
 conv2d_138 (Conv2D)            (None, 28, 28, 128)  16384       ['lambda_2[0][0]']               
                                                                                                  
 multiply_1 (Multiply)          (None, 28, 28, 128)  0           ['re_lu_26[0][0]',               
                                                                  'conv2d_138[0][0]']             
                                                                                                  
 conv2d_139 (Conv2D)            (None, 28, 28, 128)  16384       ['lambda_2[0][0]']               
                                                                                                  
 add_5 (Add)                    (None, 28, 28, 128)  0           ['multiply_1[0][0]',             
                                                                  'conv2d_139[0][0]']             
                                                                                                  
 conv2d_140 (Conv2D)            (None, 28, 28, 64)   73728       ['add_5[0][0]']                  
                                                                                                  
 batch_normalization_147 (Batch  (None, 28, 28, 64)  256         ['conv2d_140[0][0]']             
 Normalization)                                                                                   
                                                                                                  
 re_lu_27 (ReLU)                (None, 28, 28, 64)   0           ['batch_normalization_147[0][0]']
                                                                                                  
 conv2d_141 (Conv2D)            (None, 28, 28, 1)    577         ['re_lu_27[0][0]']               
                                                                                                  
==================================================================================================
Total params: 13,029,517
Trainable params: 12,975,373
Non-trainable params: 54,144
__________________________________________________________________________________________________
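The `multiply_1`/`add_5` pair at the tail of the generator summary applies feature-wise conditioning: the conditioning vector is reshaped to a 1×1 map, broadcast spatially by the lambda layer, and then scales and shifts the ReLU feature map channel-wise, in the style of FiLM. A minimal NumPy sketch of that computation under this reading, using the printed (28, 28, 128) shapes; the weight names are illustrative, and the 1×1 convolutions reduce to per-channel linear maps because the broadcast conditioning map is spatially constant:

```python
import numpy as np

def film_modulate(h, cond, w_gamma, w_beta):
    """Feature-wise modulation h * gamma(cond) + beta(cond), broadcast over
    the spatial grid -- mirroring the multiply_1/add_5 pair in the summary."""
    gamma = cond @ w_gamma   # (128,) per-channel scale (stand-in for conv2d_138)
    beta = cond @ w_beta     # (128,) per-channel shift (stand-in for conv2d_139)
    return h * gamma + beta  # broadcasts over the 28x28 spatial positions

rng = np.random.default_rng(0)
h = rng.normal(size=(28, 28, 128))                  # ReLU feature map
cond = rng.normal(size=(128,))                      # conditioning vector (dense_28)
w_gamma = rng.normal(size=(128, 128), scale=0.1)
w_beta = rng.normal(size=(128, 128), scale=0.1)
out = film_modulate(h, cond, w_gamma, w_beta)
print(out.shape)  # (28, 28, 128)
```

Because the scale and shift depend on the label embedding, every channel of the feature map can be re-weighted per class, which is how the generator keeps class identity through the upsampling stages.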

Enhanced CGAN Discriminator Architecture:
Model: "enhanced_cgan_discriminator"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
==================================================================================================
 label_input (InputLayer)       [(None,)]            0           []                               
                                                                                                  
 embedding_17 (Embedding)       (None, 64)           1024        ['label_input[0][0]']            
                                                                                                  
 dense_29 (Dense)               (None, 784)          50960       ['embedding_17[0][0]']           
                                                                                                  
 img_input (InputLayer)         [(None, 28, 28, 1)]  0           []                               
                                                                                                  
 reshape_14 (Reshape)           (None, 28, 28, 1)    0           ['dense_29[0][0]']               
                                                                                                  
 concatenate_18 (Concatenate)   (None, 28, 28, 2)    0           ['img_input[0][0]',              
                                                                  'reshape_14[0][0]']             
                                                                                                  
 spectral_normalization_9 (Spec  (None, 14, 14, 64)  2176        ['concatenate_18[0][0]']         
 tralNormalization)                                                                               
                                                                                                  
 leaky_re_lu_42 (LeakyReLU)     (None, 14, 14, 64)   0           ['spectral_normalization_9[0][0]'
                                                                 ]                                
                                                                                                  
 dropout_33 (Dropout)           (None, 14, 14, 64)   0           ['leaky_re_lu_42[0][0]']         
                                                                                                  
 spectral_normalization_10 (Spe  (None, 7, 7, 128)   131328      ['dropout_33[0][0]']             
 ctralNormalization)                                                                              
                                                                                                  
 batch_normalization_148 (Batch  (None, 7, 7, 128)   512         ['spectral_normalization_10[0][0]
 Normalization)                                                  ']                               
                                                                                                  
 leaky_re_lu_43 (LeakyReLU)     (None, 7, 7, 128)    0           ['batch_normalization_148[0][0]']
                                                                                                  
 dropout_34 (Dropout)           (None, 7, 7, 128)    0           ['leaky_re_lu_43[0][0]']         
                                                                                                  
 reshape_15 (Reshape)           (None, 49, 128)      0           ['dropout_34[0][0]']             
                                                                                                  
 dense_30 (Dense)               (None, 49, 64)       8256        ['reshape_15[0][0]']             
                                                                                                  
 dense_31 (Dense)               (None, 49, 64)       8256        ['reshape_15[0][0]']             
                                                                                                  
 lambda_3 (Lambda)              (None, 49, 49)       0           ['dense_30[0][0]',               
                                                                  'dense_31[0][0]']               
                                                                                                  
 activation_95 (Activation)     (None, 49, 49)       0           ['lambda_3[0][0]']               
                                                                                                  
 dense_32 (Dense)               (None, 49, 64)       8256        ['reshape_15[0][0]']             
                                                                                                  
 lambda_4 (Lambda)              (None, 49, 64)       0           ['activation_95[0][0]',          
                                                                  'dense_32[0][0]']               
                                                                                                  
 dense_33 (Dense)               (None, 49, 128)      8320        ['lambda_4[0][0]']               
                                                                                                  
 add_6 (Add)                    (None, 49, 128)      0           ['reshape_15[0][0]',             
                                                                  'dense_33[0][0]']               
                                                                                                  
 reshape_16 (Reshape)           (None, 7, 7, 128)    0           ['add_6[0][0]']                  
                                                                                                  
 spectral_normalization_11 (Spe  (None, 4, 4, 256)   524800      ['reshape_16[0][0]']             
 ctralNormalization)                                                                              
                                                                                                  
 batch_normalization_149 (Batch  (None, 4, 4, 256)   1024        ['spectral_normalization_11[0][0]
 Normalization)                                                  ']                               
                                                                                                  
 leaky_re_lu_44 (LeakyReLU)     (None, 4, 4, 256)    0           ['batch_normalization_149[0][0]']
                                                                                                  
 dropout_35 (Dropout)           (None, 4, 4, 256)    0           ['leaky_re_lu_44[0][0]']         
                                                                                                  
 spectral_normalization_12 (Spe  (None, 2, 2, 512)   1180672     ['dropout_35[0][0]']             
 ctralNormalization)                                                                              
                                                                                                  
 batch_normalization_150 (Batch  (None, 2, 2, 512)   2048        ['spectral_normalization_12[0][0]
 Normalization)                                                  ']                               
                                                                                                  
 leaky_re_lu_45 (LeakyReLU)     (None, 2, 2, 512)    0           ['batch_normalization_150[0][0]']
                                                                                                  
 dropout_36 (Dropout)           (None, 2, 2, 512)    0           ['leaky_re_lu_45[0][0]']         
                                                                                                  
 global_average_pooling2d_1 (Gl  (None, 512)         0           ['dropout_36[0][0]']             
 obalAveragePooling2D)                                                                            
                                                                                                  
 spectral_normalization_13 (Spe  (None, 512)         263168      ['global_average_pooling2d_1[0][0
 ctralNormalization)                                             ]']                              
                                                                                                  
 leaky_re_lu_46 (LeakyReLU)     (None, 512)          0           ['spectral_normalization_13[0][0]
                                                                 ']                               
                                                                                                  
 dropout_37 (Dropout)           (None, 512)          0           ['leaky_re_lu_46[0][0]']         
                                                                                                  
 dense_35 (Dense)               (None, 256)          131328      ['dropout_37[0][0]']             
                                                                                                  
 dense_36 (Dense)               (None, 256)          131328      ['dropout_37[0][0]']             
                                                                                                  
 validity (Dense)               (None, 1)            257         ['dense_35[0][0]']               
                                                                                                  
 label_pred (Dense)             (None, 16)           4112        ['dense_36[0][0]']               
                                                                                                  
==================================================================================================
Total params: 2,457,825
Trainable params: 2,454,561
Non-trainable params: 3,264
__________________________________________________________________________________________________
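The `dense_30`/`dense_31`/`dense_32` → `lambda_3`/`activation_95`/`lambda_4` → `dense_33`/`add_6` chain in the discriminator summary is a scaled dot-product self-attention block over the 49 spatial positions of the (7, 7, 128) feature map, with a residual connection back to the input. A minimal NumPy sketch under that reading; the weight names are illustrative and the 1/sqrt(d) scaling is an assumption (the printed summary does not reveal it):

```python
import numpy as np

def softmax(z, axis=-1):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_block(x, wq, wk, wv, wo):
    """Single-head self-attention with residual add, mirroring the
    (49, 128) -> (49, 64) -> (49, 49) -> (49, 128) shapes in the summary."""
    q = x @ wq                                       # (49, 64) queries (dense_30)
    k = x @ wk                                       # (49, 64) keys    (dense_31)
    v = x @ wv                                       # (49, 64) values  (dense_32)
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))   # (49, 49) attention map
    out = attn @ v                                   # (49, 64) attended features
    return x + out @ wo                              # residual add back to (49, 128)

rng = np.random.default_rng(0)
x = rng.normal(size=(49, 128))                       # flattened 7x7x128 feature map
wq, wk, wv = (rng.normal(size=(128, 64), scale=0.1) for _ in range(3))
wo = rng.normal(size=(64, 128), scale=0.1)
y = self_attention_block(x, wq, wk, wv, wo)
print(y.shape)  # (49, 128)
```

The attention map lets each of the 49 spatial positions aggregate evidence from every other position, so the discriminator can check global stroke structure rather than only local patches.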
In [ ]:
# =============================================================================
# ENHANCED CGAN TRAINING - COMPLETE 50 EPOCH TRAINING
# =============================================================================

# Test visualization first
print("🧪 Testing visualization with current Enhanced CGAN models...")
test_images_enhanced, test_labels_enhanced = display_generated_samples_grid(
    enhanced_cgan_generator, class_to_letter, samples_per_class=6
)

# Training configuration
NUM_EPOCHS = 50


# Set up optimizers and loss functions
enhanced_gen_optimizer = tf.keras.optimizers.Adam(learning_rate=0.0002, beta_1=0.5)
enhanced_disc_optimizer = tf.keras.optimizers.Adam(learning_rate=0.0002, beta_1=0.5)

enhanced_bce_loss = tf.keras.losses.BinaryCrossentropy(from_logits=False)
enhanced_categorical_loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)

# Training history
enhanced_gen_losses = []
enhanced_disc_losses = []
enhanced_label_accuracies = []
enhanced_epochs_list = []

@tf.function
def enhanced_train_step(real_images, real_labels, batch_size):
    """One CGAN step: update the discriminator on real/fake batches, then the
    generator, each combining an adversarial loss with an auxiliary label loss."""
    # Train discriminator
    noise = tf.random.normal([batch_size, 100])
    fake_labels = tf.random.uniform([batch_size], 0, num_classes, dtype=tf.int32)
    
    with tf.GradientTape() as disc_tape:
        # Generate fake images
        fake_images = enhanced_cgan_generator([noise, fake_labels], training=True)
        
        # Real images
        real_validity, real_label_pred = enhanced_cgan_discriminator([real_images, real_labels], training=True)
        
        # Fake images
        fake_validity, fake_label_pred = enhanced_cgan_discriminator([fake_images, fake_labels], training=True)
        
        # Adversarial losses
        real_loss = enhanced_bce_loss(tf.ones_like(real_validity), real_validity)
        fake_loss = enhanced_bce_loss(tf.zeros_like(fake_validity), fake_validity)
        
        # Label classification losses
        real_label_loss = enhanced_categorical_loss(real_labels, real_label_pred)
        fake_label_loss = enhanced_categorical_loss(fake_labels, fake_label_pred)
        
        # Total discriminator loss
        disc_loss = (real_loss + fake_loss) / 2 + (real_label_loss + fake_label_loss) / 2
    
    # Update discriminator
    gradients = disc_tape.gradient(disc_loss, enhanced_cgan_discriminator.trainable_variables)
    enhanced_disc_optimizer.apply_gradients(zip(gradients, enhanced_cgan_discriminator.trainable_variables))
    
    # Train generator
    noise = tf.random.normal([batch_size, 100])
    fake_labels = tf.random.uniform([batch_size], 0, num_classes, dtype=tf.int32)
    
    with tf.GradientTape() as gen_tape:
        fake_images = enhanced_cgan_generator([noise, fake_labels], training=True)
        fake_validity, fake_label_pred = enhanced_cgan_discriminator([fake_images, fake_labels], training=True)
        
        # Generator wants discriminator to classify fake images as real
        adversarial_loss = enhanced_bce_loss(tf.ones_like(fake_validity), fake_validity)
        
        # Generator wants correct label classification
        label_loss = enhanced_categorical_loss(fake_labels, fake_label_pred)
        
        # Total generator loss
        gen_loss = adversarial_loss + label_loss
    
    # Update generator
    gradients = gen_tape.gradient(gen_loss, enhanced_cgan_generator.trainable_variables)
    enhanced_gen_optimizer.apply_gradients(zip(gradients, enhanced_cgan_generator.trainable_variables))
    
    # Calculate accuracy
    accuracy = tf.keras.metrics.sparse_categorical_accuracy(real_labels, real_label_pred)
    
    return gen_loss, disc_loss, tf.reduce_mean(accuracy)

start_time = time.time()

for epoch in range(NUM_EPOCHS):
    epoch_start = time.time()
    
    print(f"\nEpoch {epoch + 1}")
    print("-" * 60)
    
    # Reset metrics
    epoch_gen_losses = []
    epoch_disc_losses = []
    epoch_accuracies = []
    
    # Training loop
    for step, (real_images, real_labels) in enumerate(train_dataset.take(steps_per_epoch)):
        batch_size = tf.shape(real_images)[0]
        
        gen_loss, disc_loss, accuracy = enhanced_train_step(real_images, real_labels, batch_size)
        
        epoch_gen_losses.append(float(gen_loss))
        epoch_disc_losses.append(float(disc_loss))
        epoch_accuracies.append(float(accuracy))
        
        # Print progress
        if step % 100 == 0:
            print(f"Step {step:4d}/{steps_per_epoch} - "
                  f"Gen Loss: {gen_loss:.4f}, "
                  f"Disc Loss: {disc_loss:.4f}, "
                  f"Accuracy: {accuracy:.4f}")
    
    # Store epoch averages
    avg_gen_loss = np.mean(epoch_gen_losses)
    avg_disc_loss = np.mean(epoch_disc_losses)
    avg_accuracy = np.mean(epoch_accuracies)
    
    enhanced_gen_losses.append(avg_gen_loss)
    enhanced_disc_losses.append(avg_disc_loss)
    enhanced_label_accuracies.append(avg_accuracy)
    enhanced_epochs_list.append(epoch)
    
    # Display only at final epoch
    if (epoch + 1) == NUM_EPOCHS:
        print(f"\nGenerating Grid of Samples at Final Epoch {epoch + 1}:")
        display_generated_samples_grid(enhanced_cgan_generator, class_to_letter, epoch + 1, samples_per_class=6)

    # Calculate and display epoch timing
    epoch_time = time.time() - epoch_start
    total_time = time.time() - start_time
    avg_time = total_time / (epoch + 1)
    eta = avg_time * (NUM_EPOCHS - epoch - 1)
    
total_training_time = time.time() - start_time
print(f"\nTotal training time: {total_training_time / 60:.1f} minutes")


# Generate final display
print(f"\nFinal Generated Samples - All Letter Classes:")
final_images_enhanced_cgan, final_labels_enhanced_cgan = display_generated_samples_grid(
    enhanced_cgan_generator, class_to_letter, NUM_EPOCHS, samples_per_class=6
)


# Plot training progress
if len(enhanced_gen_losses) > 1:
    plt.figure(figsize=(15, 5))
    
    # Generator and Discriminator Loss
    plt.subplot(1, 3, 1)
    plt.plot(enhanced_epochs_list, enhanced_gen_losses, label='Generator Loss', color='blue', linewidth=2)
    plt.plot(enhanced_epochs_list, enhanced_disc_losses, label='Discriminator Loss', color='red', linewidth=2)
    plt.title('Enhanced CGAN - Training Losses', fontsize=14, fontweight='bold')
    plt.xlabel('Epoch')
    plt.ylabel('Loss')
    plt.legend()
    plt.grid(True, alpha=0.3)
    
    # Discriminator Accuracy
    plt.subplot(1, 3, 2)
    plt.plot(enhanced_epochs_list, enhanced_label_accuracies, label='Discriminator Accuracy', color='green', linewidth=2)
    plt.axhline(y=0.95, color='red', linestyle='--', alpha=0.7, label='Upper limit (95%)')
    plt.axhline(y=0.70, color='orange', linestyle='--', alpha=0.7, label='Lower limit (70%)')
    plt.title('Enhanced CGAN - Discriminator Accuracy', fontsize=14, fontweight='bold')
    plt.xlabel('Epoch')
    plt.ylabel('Accuracy')
    plt.legend()
    plt.grid(True, alpha=0.3)
    
    # Loss Difference
    plt.subplot(1, 3, 3)
    loss_diff = [g - d for g, d in zip(enhanced_gen_losses, enhanced_disc_losses)]
    plt.plot(enhanced_epochs_list, loss_diff, label='Gen Loss - Disc Loss', color='purple', linewidth=2)
    plt.axhline(y=0, color='black', linestyle='-', alpha=0.5)
    plt.title('Enhanced CGAN - Loss Balance', fontsize=14, fontweight='bold')
    plt.xlabel('Epoch')
    plt.ylabel('Loss Difference')
    plt.legend()
    plt.grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.show()
🧪 Testing visualization with current Enhanced CGAN models...
Generating 6 samples per class for all 16 letter classes...
[Image output: grid of generated samples, 6 per class for all 16 letter classes]
Epoch 1
------------------------------------------------------------
Step    0/597 - Gen Loss: 3.5173, Disc Loss: 3.6070, Accuracy: 0.0312
Step  100/597 - Gen Loss: 3.4003, Disc Loss: 3.1927, Accuracy: 0.4219
Step  200/597 - Gen Loss: 1.7282, Disc Loss: 1.1165, Accuracy: 0.7812
Step  300/597 - Gen Loss: 1.3864, Disc Loss: 0.6041, Accuracy: 1.0000
Step  400/597 - Gen Loss: 1.0877, Disc Loss: 0.5931, Accuracy: 1.0000
Step  500/597 - Gen Loss: 0.8605, Disc Loss: 0.7227, Accuracy: 1.0000

Epoch 2
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.1273, Disc Loss: 0.6787, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.2812, Disc Loss: 0.6498, Accuracy: 1.0000
Step  200/597 - Gen Loss: 0.9144, Disc Loss: 0.7598, Accuracy: 1.0000
Step  300/597 - Gen Loss: 1.1499, Disc Loss: 0.5584, Accuracy: 1.0000
Step  400/597 - Gen Loss: 1.7918, Disc Loss: 0.6809, Accuracy: 0.9688
Step  500/597 - Gen Loss: 1.1205, Disc Loss: 0.6384, Accuracy: 1.0000

Epoch 3
------------------------------------------------------------
Step    0/597 - Gen Loss: 0.7676, Disc Loss: 0.5264, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.1277, Disc Loss: 0.5224, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.5479, Disc Loss: 0.5539, Accuracy: 0.9844
Step  300/597 - Gen Loss: 1.2643, Disc Loss: 0.5518, Accuracy: 1.0000
Step  400/597 - Gen Loss: 1.9343, Disc Loss: 0.5396, Accuracy: 1.0000
Step  500/597 - Gen Loss: 1.2440, Disc Loss: 0.4949, Accuracy: 1.0000

Epoch 4
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.9701, Disc Loss: 0.4319, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.4767, Disc Loss: 0.4012, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.5421, Disc Loss: 0.4883, Accuracy: 0.9844
Step  300/597 - Gen Loss: 1.6180, Disc Loss: 0.5304, Accuracy: 1.0000
Step  400/597 - Gen Loss: 2.0021, Disc Loss: 0.3320, Accuracy: 1.0000
Step  500/597 - Gen Loss: 1.9694, Disc Loss: 0.4919, Accuracy: 1.0000

Epoch 5
------------------------------------------------------------
Step    0/597 - Gen Loss: 2.4250, Disc Loss: 0.4925, Accuracy: 0.9844
Step  100/597 - Gen Loss: 1.5523, Disc Loss: 0.5153, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.7143, Disc Loss: 0.5611, Accuracy: 1.0000
Step  300/597 - Gen Loss: 1.1574, Disc Loss: 0.4729, Accuracy: 1.0000
Step  400/597 - Gen Loss: 2.9389, Disc Loss: 0.5304, Accuracy: 1.0000
Step  500/597 - Gen Loss: 1.9665, Disc Loss: 0.3780, Accuracy: 1.0000

Epoch 6
------------------------------------------------------------
Step    0/597 - Gen Loss: 2.0207, Disc Loss: 0.3195, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.9198, Disc Loss: 0.6084, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.3163, Disc Loss: 0.6110, Accuracy: 1.0000
Step  300/597 - Gen Loss: 1.8953, Disc Loss: 0.3889, Accuracy: 1.0000
Step  400/597 - Gen Loss: 2.1240, Disc Loss: 0.3612, Accuracy: 1.0000
Step  500/597 - Gen Loss: 2.2095, Disc Loss: 0.3588, Accuracy: 1.0000

Epoch 7
------------------------------------------------------------
Step    0/597 - Gen Loss: 2.0291, Disc Loss: 0.4076, Accuracy: 1.0000
Step  100/597 - Gen Loss: 2.8896, Disc Loss: 0.7032, Accuracy: 1.0000
Step  200/597 - Gen Loss: 3.0583, Disc Loss: 0.3104, Accuracy: 1.0000
Step  300/597 - Gen Loss: 2.6153, Disc Loss: 0.3596, Accuracy: 1.0000
Step  400/597 - Gen Loss: 1.3768, Disc Loss: 0.5684, Accuracy: 1.0000
Step  500/597 - Gen Loss: 2.8141, Disc Loss: 0.3938, Accuracy: 1.0000

Epoch 8
------------------------------------------------------------
Step    0/597 - Gen Loss: 3.0606, Disc Loss: 0.1862, Accuracy: 1.0000
Step  100/597 - Gen Loss: 2.1433, Disc Loss: 0.2174, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.6660, Disc Loss: 0.3517, Accuracy: 1.0000
Step  300/597 - Gen Loss: 3.7734, Disc Loss: 0.4987, Accuracy: 1.0000
Step  400/597 - Gen Loss: 1.9956, Disc Loss: 0.4104, Accuracy: 1.0000
Step  500/597 - Gen Loss: 2.2333, Disc Loss: 0.3234, Accuracy: 1.0000

Epoch 9
------------------------------------------------------------
Step    0/597 - Gen Loss: 3.6445, Disc Loss: 0.1411, Accuracy: 1.0000
Step  100/597 - Gen Loss: 3.7127, Disc Loss: 0.1337, Accuracy: 1.0000
Step  200/597 - Gen Loss: 2.9119, Disc Loss: 0.1522, Accuracy: 1.0000
Step  300/597 - Gen Loss: 2.4363, Disc Loss: 0.1667, Accuracy: 1.0000
Step  400/597 - Gen Loss: 2.9090, Disc Loss: 0.3113, Accuracy: 1.0000
Step  500/597 - Gen Loss: 2.8398, Disc Loss: 0.4852, Accuracy: 1.0000

Epoch 10
------------------------------------------------------------
Step    0/597 - Gen Loss: 3.0539, Disc Loss: 0.3002, Accuracy: 1.0000
Step  100/597 - Gen Loss: 3.0137, Disc Loss: 0.2368, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.2215, Disc Loss: 0.3706, Accuracy: 0.9844
Step  300/597 - Gen Loss: 1.4898, Disc Loss: 0.2763, Accuracy: 0.9844
Step  400/597 - Gen Loss: 3.2008, Disc Loss: 0.5443, Accuracy: 1.0000
Step  500/597 - Gen Loss: 2.2398, Disc Loss: 0.4379, Accuracy: 1.0000

Epoch 11
------------------------------------------------------------
Step    0/597 - Gen Loss: 2.6807, Disc Loss: 0.1682, Accuracy: 1.0000
Step  100/597 - Gen Loss: 2.5675, Disc Loss: 0.4954, Accuracy: 0.9844
Step  200/597 - Gen Loss: 3.5406, Disc Loss: 0.1341, Accuracy: 0.9844
Step  300/597 - Gen Loss: 3.3687, Disc Loss: 0.1936, Accuracy: 1.0000
Step  400/597 - Gen Loss: 1.5326, Disc Loss: 0.2123, Accuracy: 1.0000
Step  500/597 - Gen Loss: 2.0379, Disc Loss: 0.1855, Accuracy: 0.9844

Epoch 12
------------------------------------------------------------
Step    0/597 - Gen Loss: 2.5924, Disc Loss: 0.2647, Accuracy: 1.0000
Step  100/597 - Gen Loss: 2.4642, Disc Loss: 0.2278, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.7966, Disc Loss: 0.4795, Accuracy: 0.9844
Step  300/597 - Gen Loss: 2.4187, Disc Loss: 0.4431, Accuracy: 1.0000
Step  400/597 - Gen Loss: 1.8978, Disc Loss: 0.3571, Accuracy: 1.0000
Step  500/597 - Gen Loss: 1.7813, Disc Loss: 0.4986, Accuracy: 1.0000

Epoch 13
------------------------------------------------------------
Step    0/597 - Gen Loss: 2.7945, Disc Loss: 0.2973, Accuracy: 0.9844
Step  100/597 - Gen Loss: 0.9426, Disc Loss: 0.4763, Accuracy: 0.9844
Step  200/597 - Gen Loss: 3.9400, Disc Loss: 0.2642, Accuracy: 1.0000
Step  300/597 - Gen Loss: 2.8731, Disc Loss: 0.4383, Accuracy: 1.0000
Step  400/597 - Gen Loss: 3.0224, Disc Loss: 0.1611, Accuracy: 1.0000
Step  500/597 - Gen Loss: 2.6543, Disc Loss: 0.3739, Accuracy: 0.9844

Epoch 14
------------------------------------------------------------
Step    0/597 - Gen Loss: 3.1562, Disc Loss: 0.1876, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.6362, Disc Loss: 0.3381, Accuracy: 1.0000
Step  200/597 - Gen Loss: 2.1072, Disc Loss: 0.5863, Accuracy: 1.0000
Step  300/597 - Gen Loss: 2.3023, Disc Loss: 0.1837, Accuracy: 1.0000
Step  400/597 - Gen Loss: 3.2832, Disc Loss: 0.1110, Accuracy: 1.0000
Step  500/597 - Gen Loss: 2.6983, Disc Loss: 0.2084, Accuracy: 1.0000

Epoch 15
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.0222, Disc Loss: 0.1785, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.9287, Disc Loss: 0.5177, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.7636, Disc Loss: 0.2676, Accuracy: 1.0000
Step  300/597 - Gen Loss: 2.3188, Disc Loss: 0.2959, Accuracy: 0.9844
Step  400/597 - Gen Loss: 1.8751, Disc Loss: 0.1948, Accuracy: 1.0000
Step  500/597 - Gen Loss: 3.4373, Disc Loss: 0.2145, Accuracy: 1.0000

Epoch 16
------------------------------------------------------------
Step    0/597 - Gen Loss: 2.8262, Disc Loss: 0.2547, Accuracy: 0.9844
Step  100/597 - Gen Loss: 1.9564, Disc Loss: 0.3791, Accuracy: 1.0000
Step  200/597 - Gen Loss: 2.9443, Disc Loss: 0.2967, Accuracy: 1.0000
Step  300/597 - Gen Loss: 3.0878, Disc Loss: 0.2061, Accuracy: 1.0000
Step  400/597 - Gen Loss: 2.6462, Disc Loss: 0.5735, Accuracy: 1.0000
Step  500/597 - Gen Loss: 4.3657, Disc Loss: 0.2535, Accuracy: 1.0000

Epoch 17
------------------------------------------------------------
Step    0/597 - Gen Loss: 2.5600, Disc Loss: 0.2385, Accuracy: 1.0000
Step  100/597 - Gen Loss: 2.9906, Disc Loss: 0.2790, Accuracy: 1.0000
Step  200/597 - Gen Loss: 2.9574, Disc Loss: 0.1516, Accuracy: 1.0000
Step  300/597 - Gen Loss: 1.9709, Disc Loss: 0.5427, Accuracy: 1.0000
Step  400/597 - Gen Loss: 2.7090, Disc Loss: 0.1797, Accuracy: 1.0000
Step  500/597 - Gen Loss: 3.9163, Disc Loss: 0.1759, Accuracy: 1.0000

Epoch 18
------------------------------------------------------------
Step    0/597 - Gen Loss: 3.0526, Disc Loss: 0.3836, Accuracy: 1.0000
Step  100/597 - Gen Loss: 2.9952, Disc Loss: 0.3715, Accuracy: 0.9844
Step  200/597 - Gen Loss: 2.8856, Disc Loss: 0.2652, Accuracy: 0.9688
Step  300/597 - Gen Loss: 2.5462, Disc Loss: 0.6305, Accuracy: 0.9844
Step  400/597 - Gen Loss: 2.7813, Disc Loss: 0.3632, Accuracy: 1.0000
Step  500/597 - Gen Loss: 3.4293, Disc Loss: 0.1532, Accuracy: 1.0000

Epoch 19
------------------------------------------------------------
Step    0/597 - Gen Loss: 2.3220, Disc Loss: 0.2379, Accuracy: 0.9844
Step  100/597 - Gen Loss: 3.3764, Disc Loss: 0.3138, Accuracy: 1.0000
Step  200/597 - Gen Loss: 4.0235, Disc Loss: 0.1166, Accuracy: 1.0000
Step  300/597 - Gen Loss: 4.2511, Disc Loss: 0.1766, Accuracy: 1.0000
Step  400/597 - Gen Loss: 3.6929, Disc Loss: 0.0693, Accuracy: 1.0000
Step  500/597 - Gen Loss: 2.2505, Disc Loss: 0.0954, Accuracy: 1.0000

Epoch 20
------------------------------------------------------------
Step    0/597 - Gen Loss: 3.2605, Disc Loss: 0.0316, Accuracy: 1.0000
Step  100/597 - Gen Loss: 4.4990, Disc Loss: 0.1302, Accuracy: 1.0000
Step  200/597 - Gen Loss: 2.1587, Disc Loss: 0.1577, Accuracy: 1.0000
Step  300/597 - Gen Loss: 0.9791, Disc Loss: 0.1687, Accuracy: 1.0000
Step  400/597 - Gen Loss: 3.0937, Disc Loss: 0.4438, Accuracy: 1.0000
Step  500/597 - Gen Loss: 2.6558, Disc Loss: 0.3267, Accuracy: 1.0000

Epoch 21
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.6431, Disc Loss: 0.5088, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.8833, Disc Loss: 0.4461, Accuracy: 1.0000
Step  200/597 - Gen Loss: 3.0055, Disc Loss: 0.2822, Accuracy: 1.0000
Step  300/597 - Gen Loss: 1.9958, Disc Loss: 0.5564, Accuracy: 1.0000
Step  400/597 - Gen Loss: 1.8277, Disc Loss: 0.3167, Accuracy: 1.0000
Step  500/597 - Gen Loss: 3.0861, Disc Loss: 0.6515, Accuracy: 0.9844

Epoch 22
------------------------------------------------------------
Step    0/597 - Gen Loss: 2.1307, Disc Loss: 0.3331, Accuracy: 1.0000
Step  100/597 - Gen Loss: 3.1153, Disc Loss: 0.3616, Accuracy: 1.0000
Step  200/597 - Gen Loss: 2.8228, Disc Loss: 0.2130, Accuracy: 1.0000
Step  300/597 - Gen Loss: 1.0607, Disc Loss: 0.2705, Accuracy: 1.0000
Step  400/597 - Gen Loss: 1.4425, Disc Loss: 0.3991, Accuracy: 1.0000
Step  500/597 - Gen Loss: 2.7961, Disc Loss: 0.4043, Accuracy: 1.0000

Epoch 23
------------------------------------------------------------
Step    0/597 - Gen Loss: 3.6344, Disc Loss: 0.2638, Accuracy: 1.0000
Step  100/597 - Gen Loss: 3.9751, Disc Loss: 0.5934, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.7700, Disc Loss: 0.2411, Accuracy: 1.0000
Step  300/597 - Gen Loss: 5.7474, Disc Loss: 0.3168, Accuracy: 0.9844
Step  400/597 - Gen Loss: 2.0676, Disc Loss: 0.2760, Accuracy: 1.0000
Step  500/597 - Gen Loss: 1.4545, Disc Loss: 0.4805, Accuracy: 1.0000

Epoch 24
------------------------------------------------------------
Step    0/597 - Gen Loss: 2.7278, Disc Loss: 0.3764, Accuracy: 1.0000
Step  100/597 - Gen Loss: 2.2783, Disc Loss: 0.3458, Accuracy: 1.0000
Step  200/597 - Gen Loss: 2.3791, Disc Loss: 0.3939, Accuracy: 1.0000
Step  300/597 - Gen Loss: 2.7394, Disc Loss: 0.5762, Accuracy: 1.0000
Step  400/597 - Gen Loss: 3.0883, Disc Loss: 0.2468, Accuracy: 1.0000
Step  500/597 - Gen Loss: 2.8437, Disc Loss: 0.5771, Accuracy: 1.0000

Epoch 25
------------------------------------------------------------
Step    0/597 - Gen Loss: 3.2537, Disc Loss: 0.2439, Accuracy: 1.0000
Step  100/597 - Gen Loss: 4.3684, Disc Loss: 0.1022, Accuracy: 1.0000
Step  200/597 - Gen Loss: 3.8824, Disc Loss: 0.2678, Accuracy: 1.0000
Step  300/597 - Gen Loss: 4.3600, Disc Loss: 0.8350, Accuracy: 1.0000
Step  400/597 - Gen Loss: 6.0195, Disc Loss: 0.2515, Accuracy: 1.0000
Step  500/597 - Gen Loss: 3.3182, Disc Loss: 0.1601, Accuracy: 1.0000

Epoch 26
------------------------------------------------------------
Step    0/597 - Gen Loss: 2.0395, Disc Loss: 0.3819, Accuracy: 1.0000
Step  100/597 - Gen Loss: 2.6228, Disc Loss: 0.2184, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.5952, Disc Loss: 0.5731, Accuracy: 1.0000
Step  300/597 - Gen Loss: 2.8860, Disc Loss: 0.1468, Accuracy: 1.0000
Step  400/597 - Gen Loss: 1.3555, Disc Loss: 0.1786, Accuracy: 1.0000
Step  500/597 - Gen Loss: 3.2297, Disc Loss: 0.4617, Accuracy: 1.0000

Epoch 27
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.9118, Disc Loss: 0.4352, Accuracy: 1.0000
Step  100/597 - Gen Loss: 2.8890, Disc Loss: 0.5551, Accuracy: 1.0000
Step  200/597 - Gen Loss: 2.4782, Disc Loss: 0.4880, Accuracy: 1.0000
Step  300/597 - Gen Loss: 2.1287, Disc Loss: 0.4523, Accuracy: 0.9844
Step  400/597 - Gen Loss: 2.2108, Disc Loss: 0.2902, Accuracy: 1.0000
Step  500/597 - Gen Loss: 1.0849, Disc Loss: 0.4252, Accuracy: 1.0000

Epoch 28
------------------------------------------------------------
Step    0/597 - Gen Loss: 2.3963, Disc Loss: 0.2859, Accuracy: 1.0000
Step  100/597 - Gen Loss: 0.9649, Disc Loss: 0.4041, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.6064, Disc Loss: 0.3680, Accuracy: 1.0000
Step  300/597 - Gen Loss: 1.6205, Disc Loss: 0.5519, Accuracy: 1.0000
Step  400/597 - Gen Loss: 1.4772, Disc Loss: 0.4349, Accuracy: 1.0000
Step  500/597 - Gen Loss: 1.6317, Disc Loss: 0.3333, Accuracy: 1.0000

Epoch 29
------------------------------------------------------------
Step    0/597 - Gen Loss: 2.0076, Disc Loss: 0.2733, Accuracy: 1.0000
Step  100/597 - Gen Loss: 2.8387, Disc Loss: 0.3260, Accuracy: 1.0000
Step  200/597 - Gen Loss: 2.6398, Disc Loss: 0.3752, Accuracy: 1.0000
Step  300/597 - Gen Loss: 2.7326, Disc Loss: 0.5486, Accuracy: 1.0000
Step  400/597 - Gen Loss: 2.2567, Disc Loss: 0.3874, Accuracy: 1.0000
Step  500/597 - Gen Loss: 2.7250, Disc Loss: 0.1016, Accuracy: 1.0000

Epoch 30
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.9994, Disc Loss: 0.5309, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.7908, Disc Loss: 0.1697, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.7011, Disc Loss: 0.3897, Accuracy: 1.0000
Step  300/597 - Gen Loss: 1.2541, Disc Loss: 0.3925, Accuracy: 1.0000
Step  400/597 - Gen Loss: 0.7985, Disc Loss: 0.2548, Accuracy: 1.0000
Step  500/597 - Gen Loss: 2.0496, Disc Loss: 0.3962, Accuracy: 1.0000

Epoch 31
------------------------------------------------------------
Step    0/597 - Gen Loss: 4.4182, Disc Loss: 0.0808, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.4641, Disc Loss: 0.3280, Accuracy: 1.0000
Step  200/597 - Gen Loss: 3.2785, Disc Loss: 0.3043, Accuracy: 1.0000
Step  300/597 - Gen Loss: 2.7168, Disc Loss: 0.6540, Accuracy: 1.0000
Step  400/597 - Gen Loss: 1.8257, Disc Loss: 0.3662, Accuracy: 1.0000
Step  500/597 - Gen Loss: 0.8125, Disc Loss: 0.5467, Accuracy: 1.0000

Epoch 32
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.6745, Disc Loss: 0.2369, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.9367, Disc Loss: 0.2217, Accuracy: 0.9844
Step  200/597 - Gen Loss: 2.9339, Disc Loss: 0.4855, Accuracy: 1.0000
Step  300/597 - Gen Loss: 1.7845, Disc Loss: 0.9495, Accuracy: 1.0000
Step  400/597 - Gen Loss: 2.6571, Disc Loss: 0.2359, Accuracy: 1.0000
Step  500/597 - Gen Loss: 2.3393, Disc Loss: 0.1455, Accuracy: 1.0000

Epoch 33
------------------------------------------------------------
Step    0/597 - Gen Loss: 2.1996, Disc Loss: 0.3274, Accuracy: 1.0000
Step  100/597 - Gen Loss: 2.8820, Disc Loss: 0.4202, Accuracy: 1.0000
Step  200/597 - Gen Loss: 2.9712, Disc Loss: 0.0576, Accuracy: 1.0000
Step  300/597 - Gen Loss: 2.7701, Disc Loss: 0.2573, Accuracy: 1.0000
Step  400/597 - Gen Loss: 2.5117, Disc Loss: 0.4496, Accuracy: 1.0000
Step  500/597 - Gen Loss: 1.7163, Disc Loss: 0.3626, Accuracy: 1.0000

Epoch 34
------------------------------------------------------------
Step    0/597 - Gen Loss: 0.9602, Disc Loss: 0.2358, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.1833, Disc Loss: 0.4677, Accuracy: 1.0000
Step  200/597 - Gen Loss: 2.6708, Disc Loss: 0.1428, Accuracy: 1.0000
Step  300/597 - Gen Loss: 1.7735, Disc Loss: 0.5486, Accuracy: 1.0000
Step  400/597 - Gen Loss: 2.4601, Disc Loss: 0.0930, Accuracy: 1.0000
Step  500/597 - Gen Loss: 2.3701, Disc Loss: 0.4082, Accuracy: 1.0000

Epoch 35
------------------------------------------------------------
Step    0/597 - Gen Loss: 2.2349, Disc Loss: 0.4788, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.2050, Disc Loss: 0.4480, Accuracy: 1.0000
Step  200/597 - Gen Loss: 2.7486, Disc Loss: 0.6385, Accuracy: 0.9844
Step  300/597 - Gen Loss: 2.1610, Disc Loss: 0.3372, Accuracy: 1.0000
Step  400/597 - Gen Loss: 2.1572, Disc Loss: 0.8741, Accuracy: 1.0000
Step  500/597 - Gen Loss: 2.9842, Disc Loss: 0.2435, Accuracy: 0.9844

Epoch 36
------------------------------------------------------------
Step    0/597 - Gen Loss: 2.6373, Disc Loss: 0.3867, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.5811, Disc Loss: 0.4889, Accuracy: 1.0000
Step  200/597 - Gen Loss: 3.0635, Disc Loss: 0.1913, Accuracy: 1.0000
Step  300/597 - Gen Loss: 1.6800, Disc Loss: 0.1602, Accuracy: 1.0000
Step  400/597 - Gen Loss: 2.6937, Disc Loss: 0.4480, Accuracy: 0.9844
Step  500/597 - Gen Loss: 3.0612, Disc Loss: 0.4603, Accuracy: 1.0000

Epoch 37
------------------------------------------------------------
Step    0/597 - Gen Loss: 2.9426, Disc Loss: 0.3390, Accuracy: 0.9844
Step  100/597 - Gen Loss: 1.6300, Disc Loss: 0.4150, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.3517, Disc Loss: 0.2819, Accuracy: 1.0000
Step  300/597 - Gen Loss: 2.3280, Disc Loss: 0.1478, Accuracy: 1.0000
Step  400/597 - Gen Loss: 1.5800, Disc Loss: 0.2448, Accuracy: 1.0000
Step  500/597 - Gen Loss: 3.5525, Disc Loss: 0.2101, Accuracy: 1.0000

Epoch 38
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.4270, Disc Loss: 0.5425, Accuracy: 1.0000
Step  100/597 - Gen Loss: 2.4207, Disc Loss: 0.3367, Accuracy: 1.0000
Step  200/597 - Gen Loss: 2.1398, Disc Loss: 0.4275, Accuracy: 1.0000
Step  300/597 - Gen Loss: 4.0988, Disc Loss: 0.2502, Accuracy: 1.0000
Step  400/597 - Gen Loss: 3.2271, Disc Loss: 0.1395, Accuracy: 1.0000
Step  500/597 - Gen Loss: 2.6671, Disc Loss: 0.3743, Accuracy: 1.0000

Epoch 39
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.7588, Disc Loss: 0.6776, Accuracy: 1.0000
Step  100/597 - Gen Loss: 2.1298, Disc Loss: 0.2406, Accuracy: 1.0000
Step  200/597 - Gen Loss: 2.7701, Disc Loss: 0.2405, Accuracy: 1.0000
Step  300/597 - Gen Loss: 1.4563, Disc Loss: 0.8833, Accuracy: 1.0000
Step  400/597 - Gen Loss: 3.7922, Disc Loss: 0.0941, Accuracy: 1.0000
Step  500/597 - Gen Loss: 1.8470, Disc Loss: 0.5225, Accuracy: 1.0000

Epoch 40
------------------------------------------------------------
Step    0/597 - Gen Loss: 2.7720, Disc Loss: 0.2658, Accuracy: 0.9844
Step  100/597 - Gen Loss: 2.2273, Disc Loss: 0.5104, Accuracy: 1.0000
Step  200/597 - Gen Loss: 2.5397, Disc Loss: 0.2931, Accuracy: 1.0000
Step  300/597 - Gen Loss: 2.4776, Disc Loss: 0.4255, Accuracy: 1.0000
Step  400/597 - Gen Loss: 2.1666, Disc Loss: 0.2591, Accuracy: 1.0000
Step  500/597 - Gen Loss: 1.7052, Disc Loss: 0.1322, Accuracy: 1.0000

Epoch 41
------------------------------------------------------------
Step    0/597 - Gen Loss: 2.2224, Disc Loss: 0.1575, Accuracy: 1.0000
Step  100/597 - Gen Loss: 3.5117, Disc Loss: 0.5867, Accuracy: 0.9844
Step  200/597 - Gen Loss: 3.6945, Disc Loss: 0.1202, Accuracy: 1.0000
Step  300/597 - Gen Loss: 3.1906, Disc Loss: 0.2600, Accuracy: 1.0000
Step  400/597 - Gen Loss: 1.1727, Disc Loss: 0.4048, Accuracy: 1.0000
Step  500/597 - Gen Loss: 3.3958, Disc Loss: 0.2649, Accuracy: 1.0000

Epoch 42
------------------------------------------------------------
Step    0/597 - Gen Loss: 3.9941, Disc Loss: 0.2056, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.9017, Disc Loss: 0.2723, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.5867, Disc Loss: 0.2939, Accuracy: 1.0000
Step  300/597 - Gen Loss: 1.3508, Disc Loss: 0.3027, Accuracy: 0.9844
Step  400/597 - Gen Loss: 1.7040, Disc Loss: 0.4114, Accuracy: 1.0000
Step  500/597 - Gen Loss: 2.1458, Disc Loss: 0.2050, Accuracy: 1.0000

Epoch 43
------------------------------------------------------------
Step    0/597 - Gen Loss: 0.6283, Disc Loss: 0.6908, Accuracy: 1.0000
Step  100/597 - Gen Loss: 2.4713, Disc Loss: 0.4245, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.2385, Disc Loss: 0.3767, Accuracy: 1.0000
Step  300/597 - Gen Loss: 1.4906, Disc Loss: 0.4895, Accuracy: 1.0000
Step  400/597 - Gen Loss: 1.8334, Disc Loss: 0.3927, Accuracy: 1.0000
Step  500/597 - Gen Loss: 1.3340, Disc Loss: 0.3113, Accuracy: 1.0000

Epoch 44
------------------------------------------------------------
Step    0/597 - Gen Loss: 2.4623, Disc Loss: 0.3484, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.5022, Disc Loss: 0.5202, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.6121, Disc Loss: 0.2282, Accuracy: 1.0000
Step  300/597 - Gen Loss: 2.3901, Disc Loss: 0.1044, Accuracy: 1.0000
Step  400/597 - Gen Loss: 1.4374, Disc Loss: 0.2284, Accuracy: 1.0000
Step  500/597 - Gen Loss: 3.6515, Disc Loss: 0.1743, Accuracy: 1.0000

Epoch 45
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.3866, Disc Loss: 0.3291, Accuracy: 1.0000
Step  100/597 - Gen Loss: 4.3039, Disc Loss: 0.1775, Accuracy: 0.9844
Step  200/597 - Gen Loss: 2.4097, Disc Loss: 0.2491, Accuracy: 1.0000
Step  300/597 - Gen Loss: 3.6484, Disc Loss: 0.2165, Accuracy: 1.0000
Step  400/597 - Gen Loss: 1.9876, Disc Loss: 0.1182, Accuracy: 1.0000
Step  500/597 - Gen Loss: 3.7940, Disc Loss: 0.1993, Accuracy: 1.0000

Epoch 46
------------------------------------------------------------
Step    0/597 - Gen Loss: 3.5496, Disc Loss: 0.2473, Accuracy: 1.0000
Step  100/597 - Gen Loss: 2.2197, Disc Loss: 0.6030, Accuracy: 0.9844
Step  200/597 - Gen Loss: 1.5221, Disc Loss: 0.7216, Accuracy: 1.0000
Step  300/597 - Gen Loss: 2.4736, Disc Loss: 0.3929, Accuracy: 1.0000
Step  400/597 - Gen Loss: 1.8718, Disc Loss: 0.3118, Accuracy: 1.0000
Step  500/597 - Gen Loss: 1.8517, Disc Loss: 0.4374, Accuracy: 1.0000

Epoch 47
------------------------------------------------------------
Step    0/597 - Gen Loss: 2.4613, Disc Loss: 0.3245, Accuracy: 1.0000
Step  100/597 - Gen Loss: 1.4967, Disc Loss: 0.3137, Accuracy: 1.0000
Step  200/597 - Gen Loss: 2.6631, Disc Loss: 0.4794, Accuracy: 1.0000
Step  300/597 - Gen Loss: 3.5726, Disc Loss: 0.4046, Accuracy: 1.0000
Step  400/597 - Gen Loss: 1.8450, Disc Loss: 0.1957, Accuracy: 0.9844
Step  500/597 - Gen Loss: 2.0811, Disc Loss: 0.5062, Accuracy: 0.9844

Epoch 48
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.2597, Disc Loss: 0.4166, Accuracy: 1.0000
Step  100/597 - Gen Loss: 0.5783, Disc Loss: 0.4595, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.7692, Disc Loss: 0.4126, Accuracy: 1.0000
Step  300/597 - Gen Loss: 2.3921, Disc Loss: 0.1368, Accuracy: 1.0000
Step  400/597 - Gen Loss: 2.5431, Disc Loss: 0.5014, Accuracy: 1.0000
Step  500/597 - Gen Loss: 3.2760, Disc Loss: 0.2042, Accuracy: 1.0000

Epoch 49
------------------------------------------------------------
Step    0/597 - Gen Loss: 4.1493, Disc Loss: 0.2174, Accuracy: 1.0000
Step  100/597 - Gen Loss: 2.4277, Disc Loss: 0.4108, Accuracy: 1.0000
Step  200/597 - Gen Loss: 1.4193, Disc Loss: 0.3145, Accuracy: 1.0000
Step  300/597 - Gen Loss: 2.6866, Disc Loss: 0.1514, Accuracy: 1.0000
Step  400/597 - Gen Loss: 4.2587, Disc Loss: 0.4559, Accuracy: 1.0000
Step  500/597 - Gen Loss: 1.0000, Disc Loss: 0.5030, Accuracy: 1.0000

Epoch 50
------------------------------------------------------------
Step    0/597 - Gen Loss: 1.8456, Disc Loss: 0.3913, Accuracy: 1.0000
Step  100/597 - Gen Loss: 2.1589, Disc Loss: 0.1632, Accuracy: 1.0000
Step  200/597 - Gen Loss: 2.5233, Disc Loss: 0.4711, Accuracy: 1.0000
Step  300/597 - Gen Loss: 2.5479, Disc Loss: 0.8653, Accuracy: 1.0000
Step  400/597 - Gen Loss: 3.4330, Disc Loss: 0.0938, Accuracy: 1.0000
Step  500/597 - Gen Loss: 2.4952, Disc Loss: 0.7561, Accuracy: 0.9844

Generating Grid of Samples at Final Epoch 50:
Generating 6 samples per class for all 16 letter classes...
Final Generated Samples - All Letter Classes:
Generating 6 samples per class for all 16 letter classes...

Observation:¶

  • Generated Samples: Letters appear worse than those from the basic CGAN, although some letters still retain their quality and sharpness
  • Training Losses: Generator loss is highly volatile, indicating unstable adversarial learning, while discriminator loss remains low and stable.
  • Discriminator Accuracy: Accuracy stays pinned near 1.0; together with the large, fluctuating gap between the two losses, this shows a persistent training imbalance.
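
To put a number on the volatility described above, one can compare rolling standard deviations of the two loss curves. A minimal sketch, using a few loss values hand-copied from the log above rather than the actual history object:

```python
import numpy as np

def rolling_std(values, window=5):
    """Rolling standard deviation over a 1-D loss curve."""
    v = np.asarray(values, dtype=np.float64)
    return np.array([v[max(0, i - window + 1):i + 1].std() for i in range(len(v))])

# A few generator/discriminator losses copied from the log above
gen_losses = [3.9751, 1.7700, 5.7474, 2.0676, 1.4545, 2.7278, 2.2783, 2.3791]
disc_losses = [0.5934, 0.2411, 0.3168, 0.2760, 0.4805, 0.3764, 0.3458, 0.3939]

gen_vol = rolling_std(gen_losses).mean()
disc_vol = rolling_std(disc_losses).mean()
print(f"mean rolling std - gen: {gen_vol:.3f}, disc: {disc_vol:.3f}")
```

The generator's rolling deviation comes out several times larger than the discriminator's, which is the imbalance the bullets above describe qualitatively.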

HYPERPARAMETER TUNING FOR CGAN (without spectral norm or residual block)¶

  • Concluding our model training, the baseline CGAN generates the letters best, although some letters remain ambiguous to the model
  • By visual inspection, the model is still unsure about letters whose lowercase forms differ sharply from their capital forms.
  • We can try to rectify this through hyperparameter tuning, aiming for sharper and more accurate images.
  • This is visible in the following sample grids from our baseline CGAN:

[image.png, image-2.png: baseline CGAN sample grids]

What we are going to do and why:¶

  • The Discriminator Accuracy graph shows the discriminator overpowering the generator, starving it of useful gradient
  • Images look acceptable for many classes, but stroke sharpness and detail plateau, so regularization can be improved.
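
One common remedy when a discriminator overpowers the generator is one-sided label smoothing: real samples get a target of 0.9 instead of 1.0, which penalizes discriminator overconfidence and keeps useful gradient flowing. The NumPy sketch below is illustrative only (it is not part of this notebook's TensorFlow training loop, and the prediction values are made up):

```python
import numpy as np

def bce(targets, preds, eps=1e-7):
    """Binary cross-entropy, averaged over samples."""
    p = np.clip(preds, eps, 1 - eps)
    return float(-np.mean(targets * np.log(p) + (1 - targets) * np.log(1 - p)))

# A discriminator that is already very confident on real samples
real_preds = np.array([0.97, 0.99, 0.95, 0.98])

hard = bce(np.ones_like(real_preds), real_preds)         # targets = 1.0
smooth = bce(np.full_like(real_preds, 0.9), real_preds)  # targets = 0.9

print(f"hard-label BCE: {hard:.4f}, smoothed BCE: {smooth:.4f}")
```

With hard labels the loss is nearly zero once the discriminator saturates; with smoothed targets the same confident predictions incur a noticeably larger loss, discouraging the saturation seen in the accuracy plot.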
In [ ]:
# =============================================================================
# HYPERPARAMETER TUNING - COMPLETE CGAN EXPERIMENT WITH FID & KL TRACKING
# =============================================================================

# Create output directory
OUTPUT_DIR = "cgan_hyperparameter_results"
os.makedirs(OUTPUT_DIR, exist_ok=True)

# Build CGAN models for hyperparameter tuning (same as original)
def build_cgan_generator_hp(latent_dim=100, num_classes=16, img_height=28, img_width=28):
    """Build CGAN generator identical to original architecture"""
    noise_input = tf.keras.layers.Input(shape=(latent_dim,))
    label_input = tf.keras.layers.Input(shape=(), dtype='int32')
    
    # Label embedding
    label_embedding = tf.keras.layers.Embedding(num_classes, 50)(label_input)
    label_embedding = tf.keras.layers.Flatten()(label_embedding)
    
    # Combine noise and label
    combined_input = tf.keras.layers.Concatenate()([noise_input, label_embedding])
    
    # Dense layers
    x = tf.keras.layers.Dense(7 * 7 * 256, use_bias=False)(combined_input)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.LeakyReLU()(x)
    x = tf.keras.layers.Reshape((7, 7, 256))(x)
    
    # First upsampling: 7x7 -> 14x14
    x = tf.keras.layers.Conv2DTranspose(128, 5, strides=2, padding='same', use_bias=False)(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.LeakyReLU()(x)
    
    # Second upsampling: 14x14 -> 28x28
    x = tf.keras.layers.Conv2DTranspose(64, 5, strides=2, padding='same', use_bias=False)(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.LeakyReLU()(x)
    
    # Final layer
    output = tf.keras.layers.Conv2DTranspose(1, 5, strides=1, padding='same', use_bias=False, activation='tanh')(x)
    
    return tf.keras.Model([noise_input, label_input], output, name='cgan_generator_hp')

def build_cgan_discriminator_hp(img_height=28, img_width=28, num_classes=16):
    """Build CGAN discriminator identical to original architecture"""
    img_input = tf.keras.layers.Input(shape=(img_height, img_width, 1))
    label_input = tf.keras.layers.Input(shape=(), dtype='int32')
    
    # Image processing
    x = tf.keras.layers.Conv2D(64, 5, strides=2, padding='same')(img_input)
    x = tf.keras.layers.LeakyReLU(alpha=0.2)(x)
    x = tf.keras.layers.Dropout(0.3)(x)
    
    x = tf.keras.layers.Conv2D(128, 5, strides=2, padding='same')(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.LeakyReLU(alpha=0.2)(x)
    x = tf.keras.layers.Dropout(0.3)(x)
    
    x = tf.keras.layers.Flatten()(x)
    
    # Label embedding
    label_embedding = tf.keras.layers.Embedding(num_classes, 50)(label_input)
    label_embedding = tf.keras.layers.Flatten()(label_embedding)
    
    # Combine features
    x = tf.keras.layers.Concatenate()([x, label_embedding])
    
    # Final layers
    x = tf.keras.layers.Dense(1024)(x)
    x = tf.keras.layers.LeakyReLU(alpha=0.2)(x)
    x = tf.keras.layers.Dropout(0.5)(x)

    validity = tf.keras.layers.Dense(1, activation='sigmoid', name='validity')(x)
    label_pred = tf.keras.layers.Dense(num_classes, activation='softmax', name='label_pred')(x)
    return tf.keras.Model([img_input, label_input], [validity, label_pred], name='cgan_discriminator_hp')


# ====================== Metric Helpers ======================
def _timestamp():
    return datetime.datetime.now().strftime("%Y%m%d_%H%M%S")

def _to_255_rgb(imgs):
    """imgs: (N,H,W,1) float in [-1,1] or [0,1] or [0,255]; returns uint8 RGB 299x299."""
    x = imgs
    x = np.asarray(x, dtype=np.float32)
    vmin, vmax = np.min(x), np.max(x)
    # map to [0,255]
    if vmax <= 1.0 and vmin >= -1.0:
        # assume [-1,1] or [0,1]
        if vmin >= 0.0:
            x = x * 255.0
        else:
            x = (x + 1.0) * 127.5
    # if already >1, assume [0,255]
    x = np.clip(x, 0.0, 255.0)
    # grayscale -> rgb
    x = np.repeat(x, 3, axis=-1)
    # resize to 299
    x = tf.image.resize(x, (299, 299), method='bilinear').numpy()
    return x.astype(np.float32)

def _inception_activations_uint8(images_uint8_rgb, batch_size=64):
    """images should be float32 0..255 RGB 299x299 - process in batches to avoid OOM"""
    n_images = images_uint8_rgb.shape[0]
    all_features = []
    
    # Process in smaller batches to avoid GPU memory issues
    for i in range(0, n_images, batch_size):
        batch_end = min(i + batch_size, n_images)
        batch_images = images_uint8_rgb[i:batch_end]
        
        x = preprocess_input(batch_images)  # scales to [-1,1]
        batch_feats = INCEPTION(x, training=False).numpy()
        all_features.append(batch_feats)
    
    # Concatenate all batch results
    feats = np.concatenate(all_features, axis=0)
    return feats

def _fid_from_feats(f1, f2, eps=1e-6):
    mu1, mu2 = f1.mean(axis=0), f2.mean(axis=0)
    s1, s2 = np.cov(f1, rowvar=False), np.cov(f2, rowvar=False)
    covmean, _ = sqrtm((s1 @ s2).astype(np.float64), disp=False)
    if not np.isfinite(covmean).all():
        # add a small offset to the diagonal if covs are singular
        offset = np.eye(s1.shape[0]) * eps
        covmean, _ = sqrtm(((s1 + offset) @ (s2 + offset)).astype(np.float64), disp=False)
    # numerical cleanup
    if np.iscomplexobj(covmean):
        covmean = covmean.real
    diff = mu1 - mu2
    fid = diff @ diff + np.trace(s1 + s2 - 2.0 * covmean)
    return float(fid)

def compute_fid(real_imgs, fake_imgs):
    r = _to_255_rgb(real_imgs)
    f = _to_255_rgb(fake_imgs)
    rf = _inception_activations_uint8(r)
    ff = _inception_activations_uint8(f)
    return _fid_from_feats(rf, ff)

def compute_kl_label_divergence(real_labels, gen_label_probs, num_classes):
    """
    KL(P_real || Q_gen), where:
      P_real: empirical distribution of real labels in the sample
      Q_gen: average predicted class probabilities from discriminator on generated images
    """
    # P_real
    counts = np.bincount(real_labels.astype(int), minlength=num_classes).astype(np.float64)
    P = counts / max(1, counts.sum())
    # Q_gen
    Q = np.mean(gen_label_probs, axis=0).astype(np.float64)
    # add eps for stability
    eps = 1e-8
    P = np.clip(P, eps, 1.0)
    Q = np.clip(Q, eps, 1.0)
    return float(np.sum(P * np.log(P / Q)))


# ============== Trainer with per-epoch FID & KL tracking ==============
class CGANHyperparameterTrainer:
    def __init__(self, generator, discriminator, latent_dim=100, num_classes=16, 
                 gen_lr=1e-4, disc_lr=1e-4,
                 metric_real_images=None, metric_real_labels=None, metric_sample_size=256):
        self.generator = generator
        self.discriminator = discriminator
        self.latent_dim = latent_dim
        self.num_classes = num_classes

        self.gen_optimizer = tf.keras.optimizers.Adam(gen_lr)
        self.disc_optimizer = tf.keras.optimizers.Adam(disc_lr)

        self.bce_loss = tf.keras.losses.BinaryCrossentropy(from_logits=False)
        self.ce_loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)

        self.gen_loss_metric = tf.keras.metrics.Mean(name='gen_loss')
        self.disc_loss_metric = tf.keras.metrics.Mean(name='disc_loss')
        self.acc_metric = tf.keras.metrics.SparseCategoricalAccuracy(name='label_accuracy')

        self.history = {
            'gen_loss': [], 'disc_loss': [], 'label_accuracy': [],
            'fid': [], 'kl': [], 'epoch': []
        }

        # metric data (fixed subset for reproducibility & speed)
        if metric_real_images is not None and metric_real_labels is not None:
            m = min(metric_sample_size, len(metric_real_images))
            idx = np.random.RandomState(123).choice(len(metric_real_images), size=m, replace=False)
            self.metric_real_images = metric_real_images[idx]
            self.metric_real_labels = metric_real_labels[idx]
        else:
            self.metric_real_images = None
            self.metric_real_labels = None

    @tf.function
    def train_step(self, real_images, real_labels):
        batch_size = tf.shape(real_images)[0]
        # ----- Train D -----
        noise = tf.random.normal([batch_size, self.latent_dim])
        fake_labels = tf.random.uniform([batch_size], minval=0, maxval=self.num_classes, dtype=tf.int32)
        with tf.GradientTape() as tape:
            fake_images = self.generator([noise, fake_labels], training=True)
            real_validity, real_class_logits = self.discriminator([real_images, real_labels], training=True)
            fake_validity, fake_class_logits = self.discriminator([fake_images, fake_labels], training=True)

            real_loss = self.bce_loss(tf.ones_like(real_validity), real_validity)
            fake_loss = self.bce_loss(tf.zeros_like(fake_validity), fake_validity)
            validity_loss = real_loss + fake_loss

            real_class_loss = self.ce_loss(real_labels, real_class_logits)
            fake_class_loss = self.ce_loss(fake_labels, fake_class_logits)
            class_loss = 0.5 * (real_class_loss + fake_class_loss)

            disc_loss = validity_loss + class_loss

        grads = tape.gradient(disc_loss, self.discriminator.trainable_variables)
        self.disc_optimizer.apply_gradients(zip(grads, self.discriminator.trainable_variables))
        self.acc_metric.update_state(real_labels, real_class_logits)

        # ----- Train G -----
        noise = tf.random.normal([batch_size, self.latent_dim])
        gen_labels = tf.random.uniform([batch_size], minval=0, maxval=self.num_classes, dtype=tf.int32)
        with tf.GradientTape() as tape:
            gen_imgs = self.generator([noise, gen_labels], training=True)
            validity, class_logits = self.discriminator([gen_imgs, gen_labels], training=True)
            adv_loss = self.bce_loss(tf.ones_like(validity), validity)
            aux_loss = self.ce_loss(gen_labels, class_logits)
            gen_loss = adv_loss + aux_loss

        grads = tape.gradient(gen_loss, self.generator.trainable_variables)
        self.gen_optimizer.apply_gradients(zip(grads, self.generator.trainable_variables))

        self.gen_loss_metric.update_state(gen_loss)
        self.disc_loss_metric.update_state(disc_loss)

    def _compute_epoch_metrics(self, sample_gen=256):
        if self.metric_real_images is None:
            # metrics disabled if no real sample provided
            self.history['fid'].append(float('nan'))
            self.history['kl'].append(float('nan'))
            return

        m = min(sample_gen, len(self.metric_real_images))
        # sample a fixed slice (first m) for deterministic behavior
        real_imgs = self.metric_real_images[:m]
        real_labels = self.metric_real_labels[:m].astype(np.int32)

        # generate fake images with labels sampled to match real label distribution
        # (fallback to uniform if desired)
        noise = tf.random.normal([m, self.latent_dim])
        fake_labels = tf.convert_to_tensor(real_labels, dtype=tf.int32)
        fake_imgs = self.generator([noise, fake_labels], training=False).numpy()

        # ----- FID -----
        fid = compute_fid(real_imgs, fake_imgs)

        # ----- KL (label distribution): use discriminator predictions on fakes -----
        _, gen_label_probs = self.discriminator([fake_imgs, fake_labels], training=False)
        gen_label_probs = gen_label_probs.numpy()
        kl = compute_kl_label_divergence(real_labels, gen_label_probs, self.num_classes)

        self.history['fid'].append(fid)
        self.history['kl'].append(kl)

    def train_epoch(self, dataset, epoch, steps_per_epoch, verbose=True):
        if verbose:
            print(f"\nEpoch {epoch + 1}")
        self.gen_loss_metric.reset_states()
        self.disc_loss_metric.reset_states()
        self.acc_metric.reset_states()

        for step, (images, labels) in enumerate(dataset.take(steps_per_epoch)):
            self.train_step(images, labels)

        # record losses & accuracy
        self.history['gen_loss'].append(float(self.gen_loss_metric.result()))
        self.history['disc_loss'].append(float(self.disc_loss_metric.result()))
        self.history['label_accuracy'].append(float(self.acc_metric.result()))
        self.history['epoch'].append(epoch)

        # compute per-epoch FID & KL
        self._compute_epoch_metrics()

    def evaluate_performance(self):
        final_gen_loss = self.history['gen_loss'][-1] if self.history['gen_loss'] else float('inf')
        final_disc_loss = self.history['disc_loss'][-1] if self.history['disc_loss'] else float('inf')
        final_accuracy = self.history['label_accuracy'][-1] if self.history['label_accuracy'] else 0.0
        loss_stability = 1 / (1 + np.std(self.history['gen_loss'][-10:] or [0]) + np.std(self.history['disc_loss'][-10:] or [0]))
        score = 0.4 * (1 / (1 + final_gen_loss)) + 0.3 * (1 / (1 + final_disc_loss)) + 0.2 * final_accuracy + 0.1 * loss_stability
        return score


# ============== Visualization & Saving ==============
def save_and_show_generated_grid(generator, num_classes, samples_per_class, experiment_id, class_to_letter=None):
    spc = samples_per_class
    total = num_classes * spc
    rng = tf.random.Generator.from_seed(1234 + int(experiment_id))
    noise = rng.normal(shape=[total, 100])

    labels_list = np.concatenate([np.full(spc, c, dtype=np.int32) for c in range(num_classes)], axis=0)
    labels = tf.constant(labels_list, dtype=tf.int32)

    imgs = generator([noise, labels], training=False)
    imgs = (imgs + 1.0) / 2.0
    imgs = imgs.numpy()

    fig, axes = plt.subplots(spc, num_classes, figsize=(num_classes * 1.6, spc * 1.6))
    fig.suptitle(f"Experiment {experiment_id} — {spc} samples/class", fontsize=14, fontweight='bold')

    for col in range(num_classes):
        header = f"{class_to_letter.get(col, str(col))} (Class {col})" if class_to_letter else f"Class {col}"
        axes[0, col].set_title(header, fontsize=9)
        for row in range(spc):
            idx = col * spc + row
            ax = axes[row, col]
            ax.imshow(imgs[idx, :, :, 0], cmap='gray', vmin=0, vmax=1)
            ax.axis('off')

    plt.tight_layout()
    png_path = os.path.join(OUTPUT_DIR, f"cgan_exp{experiment_id:02d}_samples_{_timestamp()}.png")
    fig.savefig(png_path, dpi=200, bbox_inches='tight')
    plt.show()
    plt.close(fig)
    return png_path

def plot_and_save_metrics(history, experiment_id):
    epochs = np.array(history['epoch'])
    # align metric arrays to same length
    gl = np.array(history['gen_loss'])
    dl = np.array(history['disc_loss'])
    acc = np.array(history['label_accuracy'])
    fid = np.array(history['fid'])
    kl = np.array(history['kl'])

    fig = plt.figure(figsize=(15, 4.5))

    ax1 = plt.subplot(1, 3, 1)
    ax1.plot(epochs, gl, label='Gen Loss', linewidth=2)
    ax1.plot(epochs, dl, label='Disc Loss', linewidth=2)
    ax1.set_title(f'Exp {experiment_id} — Loss per Epoch')
    ax1.set_xlabel('Epoch'); ax1.set_ylabel('Loss'); ax1.legend(); ax1.grid(alpha=0.3)

    ax2 = plt.subplot(1, 3, 2)
    ax2.plot(epochs, fid, label='FID', linewidth=2)
    ax2.set_title(f'Exp {experiment_id} — FID per Epoch')
    ax2.set_xlabel('Epoch'); ax2.set_ylabel('FID'); ax2.grid(alpha=0.3)

    ax3 = plt.subplot(1, 3, 3)
    ax3.plot(epochs, kl, label='KL(P_real || Q_gen)', linewidth=2)
    ax3.set_title(f'Exp {experiment_id} — KL per Epoch')
    ax3.set_xlabel('Epoch'); ax3.set_ylabel('KL'); ax3.grid(alpha=0.3)

    plt.tight_layout()
    metrics_path = os.path.join(OUTPUT_DIR, f"cgan_exp{experiment_id:02d}_metrics_{_timestamp()}.png")
    plt.savefig(metrics_path, dpi=200, bbox_inches='tight')
    plt.show()
    plt.close(fig)
    return metrics_path


# ============== Experiment Runner (now also plots metrics) ==============
def run_hyperparameter_experiment(gen_lr, disc_lr, batch_size, epochs, experiment_id, class_to_letter=None):
    print(f"Running Experiment {experiment_id}: gen_lr={gen_lr}, disc_lr={disc_lr}, batch_size={batch_size}, epochs={epochs}")
    
    generator = build_cgan_generator_hp(latent_dim=100, num_classes=num_classes, img_height=28, img_width=28)
    discriminator = build_cgan_discriminator_hp(img_height=28, img_width=28, num_classes=num_classes)
    
    # Trainer now receives a fixed subset of real data for metrics
    trainer = CGANHyperparameterTrainer(
        generator=generator,
        discriminator=discriminator,
        latent_dim=100,
        num_classes=num_classes,
        gen_lr=gen_lr,
        disc_lr=disc_lr,
        metric_real_images=X_train,          # uses an internal fixed subset for speed
        metric_real_labels=y_train,
        metric_sample_size=256  # Reduced from 1024 to 256 to avoid GPU memory issues
    )
    
    dataset = tf.data.Dataset.from_tensor_slices((X_train, y_train))
    dataset = dataset.shuffle(BUFFER_SIZE).batch(batch_size, drop_remainder=True).prefetch(tf.data.AUTOTUNE)
    steps_per_epoch = len(X_train) // batch_size
    
    for epoch in range(epochs):
        trainer.train_epoch(dataset, epoch, steps_per_epoch, verbose=False)

    # ---- After last epoch: 10 samples/class (show + save) ----
    grid_path = save_and_show_generated_grid(
        generator=generator,
        num_classes=num_classes,
        samples_per_class=10,
        experiment_id=experiment_id,
        class_to_letter=class_to_letter
    )
    print(f"[Experiment {experiment_id}] Saved sample grid to: {grid_path}")

    # ---- Plot per-epoch Loss, FID, KL (show + save) ----
    metrics_path = plot_and_save_metrics(trainer.history, experiment_id)
    print(f"[Experiment {experiment_id}] Saved metrics plot to: {metrics_path}")

    # ---- Save weights for this experiment ----
    gen_w_path = os.path.join(OUTPUT_DIR, f"cgan_exp{experiment_id:02d}_generator_{_timestamp()}.h5")
    disc_w_path = os.path.join(OUTPUT_DIR, f"cgan_exp{experiment_id:02d}_discriminator_{_timestamp()}.h5")
    generator.save_weights(gen_w_path)
    discriminator.save_weights(disc_w_path)
    print(f"[Experiment {experiment_id}] Saved weights: {gen_w_path}")
    print(f"[Experiment {experiment_id}] Saved weights: {disc_w_path}")
    
    score = trainer.evaluate_performance()
    result = {
        'experiment_id': experiment_id,
        'gen_lr': gen_lr,
        'disc_lr': disc_lr,
        'batch_size': batch_size,
        'epochs': epochs,
        'score': score,
        'final_gen_loss': trainer.history['gen_loss'][-1],
        'final_disc_loss': trainer.history['disc_loss'][-1],
        'final_accuracy': trainer.history['label_accuracy'][-1],
        'generator': generator,
        'discriminator': discriminator,
        'trainer': trainer,
        'grid_path': grid_path,
        'metrics_path': metrics_path,
        'gen_w_path': gen_w_path,
        'disc_w_path': disc_w_path
    }
    print(f"Experiment {experiment_id} completed with score: {score:.4f}")
    return result



# =================== Hyperparameter Sweep ===================
hyperparameter_combinations = [
    (0.0002, 0.0002, 64, 100),
]

results = []
for i, (gen_lr, disc_lr, batch_size, epochs) in enumerate(hyperparameter_combinations, 1):
    result = run_hyperparameter_experiment(gen_lr, disc_lr, batch_size, epochs, i, class_to_letter=class_to_letter)
    results.append(result)

best_result = max(results, key=lambda x: x['score'])
print(f"\nBest hyperparameters:")
print(f"Generator LR: {best_result['gen_lr']}")
print(f"Discriminator LR: {best_result['disc_lr']}")
print(f"Batch Size: {best_result['batch_size']}")
print(f"Score: {best_result['score']:.4f}")

best_generator = best_result['generator']
best_discriminator = best_result['discriminator']

# ============== Save best weights ==============
best_gen_path = os.path.join(OUTPUT_DIR, "best_cgan_generator.h5")
best_disc_path = os.path.join(OUTPUT_DIR, "best_cgan_discriminator.h5")
best_generator.save_weights(best_gen_path)
best_discriminator.save_weights(best_disc_path)
print(f"Saved BEST generator weights to: {best_gen_path}")
print(f"Saved BEST discriminator weights to: {best_disc_path}")
Running Experiment 1: gen_lr=0.0002, disc_lr=0.0002, batch_size=64, epochs=100
[Experiment 1] Saved sample grid to: cgan_hyperparameter_results\cgan_exp01_samples_20250810_053707.png
[Experiment 1] Saved metrics plot to: cgan_hyperparameter_results\cgan_exp01_metrics_20250810_053723.png
[Experiment 1] Saved weights: cgan_hyperparameter_results\cgan_exp01_generator_20250810_053726.h5
[Experiment 1] Saved weights: cgan_hyperparameter_results\cgan_exp01_discriminator_20250810_053726.h5
Experiment 1 completed with score: 0.5934

Best hyperparameters:
Generator LR: 0.0002
Discriminator LR: 0.0002
Batch Size: 64
Score: 0.5934
Saved BEST generator weights to: cgan_hyperparameter_results\best_cgan_generator.h5
Saved BEST discriminator weights to: cgan_hyperparameter_results\best_cgan_discriminator.h5

7. Project Conclusions and Future Directions¶

7.1 Comprehensive Results Summary¶

Key Achievements and Findings¶

This comprehensive implementation and analysis of Generative Adversarial Networks for EMNIST letter generation has yielded significant insights into the practical challenges and solutions in generative modeling for handwritten character synthesis.

Technical Accomplishments¶

Dataset Analysis and Preprocessing:

  • Comprehensive EDA: Conducted thorough exploratory data analysis revealing critical dataset characteristics
  • Orientation Correction: Successfully identified and corrected systematic image orientation issues inherited from NIST database format
  • Statistical Insights: Developed deep understanding of class imbalances, feature importance, and structural complexity across letter classes
  • Quality Assessment: Implemented robust data quality validation and preprocessing pipelines

Multiple GAN Architecture Implementation:

  • Baseline DCGAN: Established fundamental benchmark with standard architecture
  • Enhanced DCGAN: Implemented advanced techniques including spectral normalization, residual blocks, and self-attention
  • WGAN Variants: Explored Wasserstein GAN formulations with gradient penalty for improved training stability
  • Conditional GANs: Developed both baseline and enhanced conditional generation architectures
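The gradient penalty mentioned for the WGAN variants can be sketched as follows. This is a generic WGAN-GP penalty term (not our exact trainer code), assuming a Keras discriminator and NHWC image batches; it enforces a unit gradient norm on random interpolates between real and generated samples:

```python
import tensorflow as tf

def gradient_penalty(discriminator, real_images, fake_images):
    """WGAN-GP penalty: push ||grad D(x_hat)|| toward 1 on interpolated samples."""
    batch_size = tf.shape(real_images)[0]
    # Random interpolation points between real and generated samples
    alpha = tf.random.uniform([batch_size, 1, 1, 1], 0.0, 1.0)
    interpolated = alpha * real_images + (1.0 - alpha) * fake_images
    with tf.GradientTape() as tape:
        tape.watch(interpolated)
        pred = discriminator(interpolated, training=True)
    grads = tape.gradient(pred, interpolated)
    # Per-sample gradient norm over the spatial and channel axes
    norm = tf.sqrt(tf.reduce_sum(tf.square(grads), axis=[1, 2, 3]) + 1e-12)
    return tf.reduce_mean(tf.square(norm - 1.0))
```

In training, this term is typically added to the critic loss with a weight of 10, the value suggested in the original WGAN-GP paper.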

Training Strategy Innovation:

  • Balanced Training: Implemented sophisticated techniques to prevent discriminator dominance
  • Advanced Optimization: Applied label smoothing, noise injection, and frequency control strategies
  • Comprehensive Evaluation: Developed multi-metric evaluation framework including FID, KL divergence, and visual quality assessment
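As an illustration of the label-smoothing strategy above, a discriminator loss with one-sided smoothing might look like the following sketch (not the trainer's exact loss; the 0.9 target value is an assumption). Softening only the real-sample targets keeps the discriminator from becoming overconfident, which is one of the balance techniques discussed:

```python
import tensorflow as tf

bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)

def discriminator_loss(real_logits, fake_logits, smooth=0.9):
    # One-sided label smoothing: real targets at 0.9 instead of 1.0;
    # fake targets stay at 0.0 (smoothing both sides can hurt training)
    real_targets = tf.fill(tf.shape(real_logits), smooth)
    real_loss = bce(real_targets, real_logits)
    fake_loss = bce(tf.zeros_like(fake_logits), fake_logits)
    return real_loss + fake_loss
```

Noise injection works similarly at the input side: adding small Gaussian noise to the discriminator's inputs blurs the boundary between real and fake batches early in training.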

Performance Insights and Comparative Analysis¶

Architecture Performance Hierarchy:

  1. Baseline CGAN: Demonstrated superior practical performance with recognizable letter generation
  2. Enhanced Techniques: Revealed complexity-performance trade-offs requiring careful balance
  3. WGAN Variants: Highlighted challenges in applying Wasserstein formulations to discrete character data
  4. Training Balance: Confirmed critical importance of adversarial balance over architectural complexity

Critical Success Factors:

  • Simplicity Over Complexity: Baseline architectures often outperformed enhanced variants
  • Training Balance: Discriminator-generator balance more important than architectural sophistication
  • Data Understanding: Deep dataset analysis crucial for informed architectural decisions
  • Evaluation Rigor: Multi-faceted evaluation essential for meaningful performance assessment

Lessons Learned and Technical Insights¶

GAN Training Dynamics¶

Discriminator Dominance Challenge: Our extensive experimentation consistently revealed the critical challenge of discriminator dominance across multiple architectures. This finding emphasizes that successful GAN training requires careful balance rather than simply powerful architectures.

Complexity Trade-offs: Enhanced techniques (spectral normalization, self-attention, residual blocks) showed mixed results, often adding computational complexity without proportional quality improvements. This highlights the importance of selective enhancement based on specific dataset characteristics.

Conditional Generation Effectiveness: Conditional GANs demonstrated clear advantages for controlled letter generation, providing both better class fidelity and more stable training dynamics compared to unconditional variants.
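The conditioning mechanism behind this can be sketched with a minimal conditional generator that fuses the class label with the noise vector via an embedding layer. The layer sizes and the 50-dimensional embedding below are illustrative assumptions, not our exact architecture:

```python
import tensorflow as tf
from tensorflow.keras import layers

NUM_CLASSES = 26   # EMNIST letters
LATENT_DIM = 100

def build_conditional_generator():
    noise = layers.Input(shape=(LATENT_DIM,))
    label = layers.Input(shape=(), dtype="int32")
    # Project the integer class label into a dense vector, then fuse with noise
    label_emb = layers.Embedding(NUM_CLASSES, 50)(label)
    x = layers.Concatenate()([noise, label_emb])
    x = layers.Dense(7 * 7 * 128, activation="relu")(x)
    x = layers.Reshape((7, 7, 128))(x)
    x = layers.Conv2DTranspose(64, 4, strides=2, padding="same", activation="relu")(x)
    img = layers.Conv2DTranspose(1, 4, strides=2, padding="same", activation="tanh")(x)
    return tf.keras.Model([noise, label], img)
```

Because the label enters the generator directly, sampling a specific letter at inference time is just a matter of passing its class index, which is what makes the per-class sample grids in this notebook possible.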

Dataset-Specific Considerations¶

Mixed Case Challenges: The EMNIST dataset's inclusion of both uppercase and lowercase letters in single classes created significant training challenges, particularly affecting letters with dramatically different case forms (A/a, G/g, Q/q).

Orientation and Quality Issues: Historical data collection methodology introduced systematic biases requiring careful preprocessing and quality control measures.

Class Imbalance Impact: Uneven class distributions necessitated careful sampling strategies and evaluation approaches to ensure fair performance assessment across all letter classes.

7.2 Future Research Directions¶

Immediate Enhancement Opportunities¶

Advanced Conditioning Strategies:

  • Explicit Case Control: Implement separate conditioning for uppercase/lowercase variants
  • Style Transfer: Develop mechanisms for consistent handwriting style generation
  • Progressive Training: Explore curriculum learning approaches for improved convergence

Architectural Innovations:

  • Attention Mechanisms: Investigate more sophisticated attention designs for character structure
  • Progressive Growing: Adapt progressive GAN techniques for character generation
  • Hybrid Approaches: Combine generative and retrieval methods for improved quality

Long-term Research Trajectories¶

Multi-Modal Generation:

  • Cross-Language Support: Extend to multiple alphabetic and non-alphabetic writing systems
  • Style Diversification: Generate diverse handwriting styles within single character classes
  • Context-Aware Generation: Develop word and sentence-level coherent handwriting synthesis

Advanced Evaluation Framework:

  • Perceptual Metrics: Develop human-perception-aligned quality assessment methods
  • Functional Evaluation: Assess generated characters through downstream recognition tasks
  • Robustness Testing: Evaluate generation quality across diverse input conditions

Real-World Applications:

  • Data Augmentation: Apply generated characters for handwriting recognition training
  • Accessibility Tools: Develop assistive technologies for handwriting synthesis
  • Digital Forensics: Investigate applications in handwriting analysis and verification

7.3 Technical Contributions and Significance¶

Novel Methodological Contributions¶

Comprehensive Evaluation Framework: Our multi-faceted evaluation approach combining quantitative metrics (FID, KL divergence) with qualitative visual assessment provides a robust template for generative model evaluation in character synthesis tasks.
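For reference, the Fréchet distance underlying FID can be computed directly from feature statistics. This minimal sketch assumes the activations (e.g. from an InceptionV3 pooling layer) have already been extracted for real and generated batches:

```python
import numpy as np
from scipy import linalg

def frechet_distance(act1, act2):
    """FID-style distance between two sets of feature activations."""
    mu1, mu2 = act1.mean(axis=0), act2.mean(axis=0)
    sigma1 = np.cov(act1, rowvar=False)
    sigma2 = np.cov(act2, rowvar=False)
    # Matrix square root of the covariance product
    covmean = linalg.sqrtm(sigma1 @ sigma2)
    if np.iscomplexobj(covmean):  # numerical noise can add tiny imaginary parts
        covmean = covmean.real
    diff = mu1 - mu2
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))
```

Identical distributions score near zero, and the score grows as the real and generated feature statistics diverge, which is why lower FID indicates better generation quality.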

Training Balance Techniques: The systematic investigation of discriminator dominance mitigation strategies offers practical guidance for GAN training in discrete generation tasks.

Dataset Analysis Methodology: Our thorough exploratory data analysis workflow, including orientation correction and quality assessment, provides a replicable framework for similar historical dataset preprocessing.

Broader Impact and Applications¶

Educational Value: This comprehensive implementation serves as an educational resource demonstrating practical GAN development challenges and solutions.

Research Foundation: The systematic comparison of multiple GAN variants provides empirical foundation for future handwriting synthesis research.

Methodological Framework: Our balanced approach to architecture complexity vs. training optimization offers guidance for similar generative modeling projects.

7.4 Final Recommendations¶

For Practitioners¶

  • Start Simple: Begin with baseline architectures before adding complexity
  • Focus on Balance: Prioritize training balance over architectural sophistication
  • Evaluate Comprehensively: Use multiple metrics and visual assessment for performance evaluation
  • Understand Data: Invest significant effort in dataset analysis and preprocessing

For Researchers¶

  • Investigate Fundamentals: Focus on understanding core GAN dynamics before architectural enhancement
  • Develop Better Metrics: Create evaluation methods that align with human perception and practical utility
  • Address Real Challenges: Target specific application requirements rather than general quality improvement
  • Collaborate Across Domains: Combine generative modeling expertise with domain-specific knowledge

This project demonstrates that successful generative modeling requires careful balance of theoretical understanding, practical implementation skills, and domain-specific knowledge. While advanced techniques offer potential improvements, fundamental training dynamics and data understanding remain the primary determinants of success in handwritten character generation tasks.

CONCLUSIONS¶

Comparing all our models, the CGAN is clearly the strongest at learning and generating handwritten letters. Measured against our basic DCGAN baseline, we can confidently select the CGAN as our best model yet.¶

  • However, there are a few caveats, such as how generation difficulty varies across the different classes.
  • Classes that are generated sharply and clearly tend to have fewer strokes, such as 'I', 'L', and 'O'. Because they have lower structural complexity and differ little from their capital forms, they can be reproduced accurately.
  • Classes with multiple strokes and significantly different capital forms, like 'G', 'A', and 'N' (get it), are harder to generate and tend to confuse our model, leading to poor generated results.

Generating coloured images vs. black-and-white Images¶

  • Generating coloured images is generally more challenging than generating black-and-white ones.

  • Complexity: Colour images have three channels (RGB) instead of one, increasing the data dimensionality and the amount of information the GAN must learn.

  • Model Size & Resources: The generator and discriminator need more parameters to process the extra channels, resulting in higher memory use, longer training times, and greater computational demand.

  • Training Stability: Greater complexity can make training less stable, often requiring more careful hyperparameter tuning and stronger regularization.

  • Quality Metrics: Image evaluation becomes harder since both structure and colour fidelity must be considered.

  • Class Feature Cues: If colours strongly correlate with certain class features, the discriminator may exploit this easily, increasing the risk of mode collapse. Conversely, distinct colour cues could help the generator learn features faster and improve output quality.
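The channel-count effect on model size is easy to quantify: a convolution's parameter count scales with its input channels. A small sketch (the filter count and kernel size are illustrative):

```python
import tensorflow as tf

def first_conv_params(channels, filters=64, kernel=3):
    """Parameter count of an input convolution for a given channel count."""
    layer = tf.keras.layers.Conv2D(filters, kernel)
    layer.build((None, 28, 28, channels))  # weights: (k*k*channels + 1) * filters
    return layer.count_params()

gray = first_conv_params(1)   # grayscale: (3*3*1 + 1) * 64 = 640
rgb = first_conv_params(3)    # RGB:       (3*3*3 + 1) * 64 = 1792
```

So even at the very first layer, RGB input nearly triples the weight count, and the extra capacity needed to model colour compounds through the deeper layers of both networks.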

8. Project Summary and Acknowledgments¶

Executive Summary¶

This comprehensive research project successfully implemented and evaluated multiple Generative Adversarial Network architectures for handwritten letter generation using the EMNIST Letters dataset. Through systematic exploration of baseline and enhanced GAN variants, we have developed significant insights into the practical challenges and solutions in character-level generative modeling.

Academic Recognition¶

Course Context: This project was completed as part of the Data Analytics and Algorithms (DAAA) program, demonstrating practical application of advanced machine learning techniques to real-world generative modeling challenges.

Closing Statement¶

This comprehensive exploration of Generative Adversarial Networks for handwritten letter generation demonstrates both the potential and challenges in applying advanced generative modeling to practical character synthesis tasks. Through systematic implementation and evaluation of multiple architectures, we have contributed valuable insights to the understanding of GAN training dynamics and performance characteristics in discrete generation domains.

The project's emphasis on thorough experimental methodology, comprehensive documentation, and practical insights reflects our commitment to advancing both theoretical understanding and practical applications in generative modeling. We hope this work serves as a valuable resource for future researchers and practitioners exploring similar challenges in computer vision and generative artificial intelligence.

Authors: Shen Lei & Xavier Lee
Program: Data Analytics and Algorithms (DAAA) - Full-Time Diploma, Class 2B/22
Date: 10 August, 2025
Institution: Singapore Polytechnic